CMB-GEC | Cancer Moonshot Biobank - Gastroesophageal Cancer Collection

DOI: 10.7937/E7KH-R486 | Data Citation Required | Image Collection

Location | Species | Subjects | Data Types | Cancer Types | Size | Supporting Data | Status | Updated
Esophagus | Human | 12 | CT, PT, MR, Histopathology | Gastroesophageal Cancer | 24.42 GB | Clinical | Public, Ongoing | 2024/08/28

Summary

The Cancer Moonshot Biobank is a National Cancer Institute initiative to support current and future investigations into drug resistance and sensitivity and other NCI-sponsored cancer research initiatives, with the aim of improving researchers' understanding of cancer and how to intervene in cancer initiation and progression. During the course of this study, biospecimens (blood and tissue removed during medical procedures) and associated data will be collected longitudinally from at least 1,000 patients across at least 10 cancer types. These patients represent the demographic diversity of the U.S. and are receiving standard-of-care cancer treatment at multiple NCI Community Oncology Research Program (NCORP) sites.

This collection contains de-identified radiology and histopathology imaging procured from subjects in NCI’s Cancer Moonshot Biobank - Gastroesophageal Cancer (CMB-GEC) cohort. Associated genomic, phenotypic, and clinical data will be hosted by The Database of Genotypes and Phenotypes (dbGaP) and other NCI databases. A summary of Cancer Moonshot Biobank imaging efforts can be found on the Cancer Moonshot Biobank Imaging page.

Data Access

Version 5: Updated 2024/08/28

Additional data were added to the collection.

Title | Data Type | Format | Access Points | Subjects | Studies | Series | Images | License
Images | CT, PT, MR | DICOM | Download requires NBIA Data Retriever | 5 | 22 | 124 | 22,321 | CC BY 4.0
Images of the head (see Restricted License) | CT, PT | DICOM | Download requires NBIA Data Retriever | 5 | 10 | 30 | 5,733 | TCIA Restricted
Tissue Slide Images, Pathology Metadata | Histopathology | JSON and SVS | Download requires IBM-Aspera-Connect plugin | 12 | | | 16 | CC BY 4.0

Related Datasets

No related analysis results or collections were found.

Additional Resources for this Dataset

The database of Genotypes and Phenotypes (dbGaP) hosts genomic, phenotypic, and clinical data for NCI’s Cancer Moonshot Biobank (CMB) project. Information and access to the data can be found at:

The NCI Cancer Research Data Commons (CRDC) provides access to additional data and a cloud-based data science infrastructure that connects data sets with analytics tools to allow users to share, integrate, analyze, and visualize cancer research data.

Citations & Data Usage Policy

Data Citation Required: Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution must include the following citation, including the Digital Object Identifier:

Data Citation

Cancer Moonshot Biobank. (2022). Cancer Moonshot Biobank – Gastroesophageal Cancer Collection (CMB-GEC) (Version 5) [Dataset]. The Cancer Imaging Archive. https://doi.org/10.7937/E7KH-R486

Acknowledgement

The Cancer Moonshot Biobank program requests that publications using data from this program include the following statement: “Data used in this publication were generated by the National Cancer Institute Cancer Moonshot Biobank.”

Detailed Description

Introduction

Biobank radiology imaging data on TCIA contain the “days from enrollment (registration)” value for each scan, embedded in the DICOM files (DICOM tag (0012,0052), with the event type in tag (0012,0053)). This allows for temporal alignment between the imaging on TCIA and the clinical events data found on the Biobank Catalog.
Note: So that the images display properly in DICOM readers, the radiology imaging data also contain de-identified dates that preserve the temporal sequence relationship between scans in a given study.

Days from enrollment (registration)

In addition to the modified date fields in the DICOM header, the “days from registration” value is calculated and stored in DICOM tag (0012,0052) Longitudinal Temporal Offset from Event, with the associated tag (0012,0053) Longitudinal Temporal Event Type set to “REGISTRATION”. Below is an example DICOM header from a scan performed 2 days before registration, resulting in a negative offset value.

(0012,0052) Longitudinal Temporal Offset from Event -2.0
(0012,0053) Longitudinal Temporal Event Type REGISTRATION
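
The two header lines above can be read programmatically. The following is a minimal sketch, assuming the header has been dumped to plain text in the same layout as the example; the function name and the dump format are illustrative assumptions, and in practice a DICOM library would typically read these tags directly from the file.

```python
import re

# Matches one line of a text header dump such as:
#   (0012,0052) Longitudinal Temporal Offset from Event -2.0
# The dump layout is an assumption based on the example shown in the text.
TAG_LINE = re.compile(
    r"^\((?P<group>[0-9A-Fa-f]{4}),(?P<element>[0-9A-Fa-f]{4})\)\s+(?P<rest>.+)$"
)


def days_from_registration(header_text):
    """Return the offset in days if the event type is REGISTRATION, else None."""
    values = {}
    for line in header_text.splitlines():
        match = TAG_LINE.match(line.strip())
        if match:
            tag = (match["group"].upper(), match["element"].upper())
            # The value is taken to be the last whitespace-separated token.
            values[tag] = match["rest"].split()[-1]
    # Only trust the offset when it is measured from registration.
    if values.get(("0012", "0053")) != "REGISTRATION":
        return None
    offset = values.get(("0012", "0052"))
    return float(offset) if offset is not None else None


example = """(0012,0052) Longitudinal Temporal Offset from Event -2.0
(0012,0053) Longitudinal Temporal Event Type REGISTRATION"""
print(days_from_registration(example))  # -2.0
```

A negative result means the scan was acquired before registration, as in the example header.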

To filter your search results using this information, use the “Clinical Trial Time Points” filter in our data portal at https://nbia.cancerimagingarchive.net/nbia-search/.

De-identification of DICOM dates

De-identification of dates for this dataset follows the “Retain Longitudinal With Modified Dates Option” of DICOM PS3.15 Annex E, which allows dates to be retained as long as they are modified from the originals. TCIA implements this with a technique that de-identifies the dates while preserving the longitudinal relationship between them. Each patient’s date of registration is mapped to January 1, 1960, and all other dates are offset relative to it. This normalized date system was chosen to make it obvious that the dates are not real, and to make it easy to determine how much time has passed between the date of registration and the patient’s related imaging studies.

For example, if the real date of a patient’s registration was 03/27/2018 and the original imaging Study Date was 03/29/2018, then the anonymized TCIA Study Date would become 01/03/1960 (two days after the base date of 01/01/1960; dates are written MM/DD/YYYY).
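
The arithmetic in this example can be sketched in a few lines. This is an illustration of the date shift as described above, not TCIA’s actual de-identification code; the function names are hypothetical.

```python
from datetime import date

# Base date stated in the text; the registration date maps onto it.
BASE_DATE = date(1960, 1, 1)


def deidentified_study_date(registration, study):
    """Shift a study date so that the registration date lands on BASE_DATE."""
    return BASE_DATE + (study - registration)


def days_since_registration(deidentified):
    """Recover the offset in days from a de-identified study date."""
    return (deidentified - BASE_DATE).days


# Worked example from the text: registration 03/27/2018, study 03/29/2018.
shifted = deidentified_study_date(date(2018, 3, 27), date(2018, 3, 29))
print(shifted.isoformat())               # 1960-01-03
print(days_since_registration(shifted))  # 2
```

Because every date for a patient is shifted by the same amount, subtracting the base date from any de-identified Study Date gives the number of days since registration.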

Related Publications

Publications by the Dataset Authors

The authors recommended the following as the best source of additional information about this dataset:

No other publications were recommended by dataset authors.

Research Community Publications

TCIA maintains a list of publications that leveraged this dataset. If you have a manuscript you’d like to add, please contact TCIA’s Helpdesk.


Previous Versions

Version 4: Updated 2024/08/04

The Patient ID format was updated to remove the collection prefixes; CMB-GEC-MSB-##### is now MSB-#####.
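
For analyses that mix files downloaded before and after this change, the old identifiers can be mapped to the new form. The following is a minimal sketch, assuming each “#” in the pattern above stands for one digit; the function name and the sample ID are hypothetical.

```python
import re

# Old format: CMB-GEC-MSB-#####, new format: MSB-#####.
# Assumption: each "#" stands for a single digit.
OLD_ID = re.compile(r"^CMB-GEC-(MSB-\d{5})$")


def normalize_patient_id(patient_id):
    """Strip the collection prefix from an old-style ID; leave other IDs unchanged."""
    match = OLD_ID.match(patient_id)
    return match.group(1) if match else patient_id


# "01234" is a made-up number used only for illustration.
print(normalize_patient_id("CMB-GEC-MSB-01234"))  # MSB-01234
print(normalize_patient_id("MSB-01234"))          # MSB-01234
```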

Title | Data Type | Format | Access Points | Subjects | Studies | Series | Images | License
Images | CT, PT | DICOM | Download requires NBIA Data Retriever | 1 | 4 | 14 | 1,293 | CC BY 4.0
Images of the head (see Restricted License) | CT, PT | DICOM | Download requires NBIA Data Retriever | 1 | 2 | 10 | 2,440 | TCIA Restricted
Tissue Slide Images, Pathology Metadata | Histopathology | JSON and SVS | Download requires IBM-Aspera-Connect plugin | 7 | | | 11 | CC BY 4.0

Version 3: Updated 2024/03/04

Added new patients and slides for histopathology data (6 patients, 8 slides).

Title | Data Type | Format | Access Points | Subjects | Studies | Series | Images | License
Images | CT, PT | DICOM | Download requires NBIA Data Retriever | 1 | 4 | 14 | 1,293 | CC BY 4.0
Images of the head (see Restricted License) | CT, PT | DICOM | Download requires NBIA Data Retriever | 1 | 2 | 10 | 2,440 | TCIA Restricted
Tissue Slide Images, Pathology Metadata | Histopathology | JSON and SVS | Download requires IBM-Aspera-Connect plugin | 7 | | | 11 | CC BY 4.0

Version 2: Updated 2023/12/07

Added TCIA Restricted Data.

Title | Data Type | Format | Access Points | Subjects | Studies | Series | Images | License
Images | CT, PT | DICOM | Download requires NBIA Data Retriever | 1 | 4 | 14 | 1,293 | CC BY 4.0
Images of the head (see Restricted License) | CT, PT | DICOM | Download requires NBIA Data Retriever | 1 | 2 | 10 | 2,440 | TCIA Restricted
Tissue Slide Images, Pathology Metadata | Histopathology | JSON and SVS | Download requires IBM-Aspera-Connect plugin | 1 | | | 3 | CC BY 4.0

Version 1: Updated 2022/08/12

Title | Data Type | Format | Access Points | Subjects | Studies | Series | Images | License
Images | CT, PT | DICOM | Download requires NBIA Data Retriever | 1 | 4 | 14 | 1,293 | CC BY 4.0
Tissue Slide Images, Pathology Metadata | Histopathology | JSON and SVS | Download requires IBM-Aspera-Connect plugin | 1 | | | 3 | CC BY 4.0