MU-Glioma-Post | University of Missouri Post-operative Glioma Dataset
DOI: 10.7937/7k9k-3c83 | Data Citation Required | Image Collection
Location | Species | Subjects | Data Types | Cancer Types | Size | Status | Updated | |
---|---|---|---|---|---|---|---|---|
Brain | Human | 203 | Segmentation, Demographic, Diagnosis, Molecular Test, Treatment, Follow-Up, Other, Measurement, Radiomic Feature | Brain Cancer | Clinical | Public, Complete | 2025/03/21 |
Summary
Abstract
This dataset includes MR imaging from 203 glioma patients with 617 different post-treatment MR time points, along with tumor segmentations. Clinical data include patient demographics, genomics, and treatment details. Preprocessing of the MR images followed a standardized pipeline with automatic tumor segmentation based on the nnU-Net deep learning approach. The automatic tumor segmentations were manually validated and refined by neuroradiologists. The heterogeneity of glioma imaging characteristics and management strategies contributes to a lack of reliable findings when evaluating treatment outcomes with conventional MRI, and the overlapping imaging features of radiation necrosis and tumor progression post-treatment can be particularly challenging for radiologists. This robust dataset should contribute to the development of AI models that improve the evaluation of treatment outcomes.
Introduction
The dataset consists of an institutional review board-approved retrospective analysis of pathologically proven glioma patients at University Hospital of the University of Missouri. The Anatomic Pathology CoPathPlus database was used to collect glioma cases over the last 10 years. Sharing segmented postoperative glioma data with clinical information accelerates research and improves clinical practice by providing a comprehensive, readily available dataset. This removes the time-consuming burden of manual segmentation, improves the accuracy and consistency of tumor delineation, and allows researchers to focus on analysis and interpretation, supporting the development of more accurate segmentation algorithms, predictive models for personalized treatment strategies, and improved patient outcome predictions. Standardized longitudinal follow-up and benchmarking capabilities further facilitate multi-center studies and objective evaluation of treatment efficacy, leading to advancements in glioma biology and personalized patient care.
Methods
The following subsections provide information about how the data were selected, acquired, and prepared for publication.
Subject Inclusion and Exclusion Criteria
The selection criteria for the CoPath Natural Language II Search included accession dates ranging from 01/01/2021 to 02/20/2024. To ensure all relevant diagnoses for this study were included, three separate keyword searches were performed using "glioma", "astrocytoma", and "glioblastoma". The search only included keyword results that were present in the Final Diagnoses. "Glioma" returned 85 cases, "Astrocytoma" returned 67 cases, and "Glioblastoma" returned 215 cases. Following the exclusion of duplicate cases, those missing any of the four requisite MR imaging sequences, and cases that failed processing through our pipeline, our final cohort comprised 203 patients.
Data Acquisition
Radiology: MRI studies on our McKesson Radiology 12.2 picture archiving and communication system (PACS) (Change Healthcare Radiology Solutions, Nashville, Tennessee, U.S.) were exported. The image export process involved multiple personnel of varying ranks, including medical graduates, radiology residents, neuroradiology fellows, and neuroradiologists. Our team exported the four basic conventional MR sequences, namely T1, T1 with IV gadolinium-based contrast agent administration, T2, and Fluid Attenuated Inversion Recovery (FLAIR), to a HIPAA-compliant, secured MU research server. For each patient, the images were thoroughly checked so that up to six post-treatment images were included, as available. The post-treatment images were captured on different dates, though not all patients had the maximum number of follow-up images; some had as few as one post-treatment follow-up MRI. For patients with more frequent follow-up MRIs, we selected the immediate post-operative scan, at least one time point of progression, and another follow-up study. The MR images were comprehensively reviewed to exclude significantly motion-degraded or suboptimal studies.
The majority of the studies were conducted using Siemens MRI machines (97.47%, n=579), with a smaller proportion performed on MRI machines from other vendors: GE (2.02%, n=12) and Philips (0.51%, n=3). Table 1 shows the distribution of studies across different Siemens MR machines. Regarding the magnetic field strength, 1.5T MRIs accounted for 54.92% (n=318) and 3T MRIs accounted for 45.08% (n=261). Table 2 summarizes the MRI parameters of each MR sequence. Our team made efforts to obtain 3D sequences whenever available. Scans were performed using 3D acquisition methods in 40.28% of cases (n=957) and 2D acquisition methods in 59.72% of cases (n=1,419). In cases where 3D images were not available, 2D images were utilized instead. Table 3 summarizes the counts and percentages of studies performed with 2D vs 3D acquisition across different MR sequences.
Clinical: Basic demographic data, clinical data points, and tumor pathology were obtained through review of the electronic medical record (EMR). Clinical data points included the date of diagnosis, date of first surgery or treatment, date and characterization of first and/or subsequent disease progression and/or recurrence, and date of any follow-up resections. Survival information included the date of death and, if that was unknown, the date of last known contact while alive. Disease progression and/or recurrence was characterized as imaging only, clinical only, or both, based on information obtained through review of each patient’s clinical notes, brain imaging, and clinical impression as documented by the primary care team. Brief summaries of the reasoning behind each characterization were also included. Patients with no further clinical contact beyond their primary treatment were documented as “lost to follow-up.” Pathological information was obtained through review of the initial pathology note and any subsequent addenda for each tumor sample and included final tumor diagnosis, grade, and any identified genetic mutations. This information was then compiled into a spreadsheet for analysis.
Data Analysis
The image data underwent preprocessing using the Federated Tumor Segmentation (FeTS) tool. The pipeline began with converting DICOM files to the Neuroimaging Informatics Technology Initiative (NIfTI) format, ensuring the removal of any remaining PHI not eliminated by the anonymization/de-identification tool. The converted NIfTI images were then resampled to an isotropic 1 mm³ resolution and co-registered to the standard anatomical human brain atlas, SRI24. A deep learning brain extraction method was applied to strip the skull and extracranial tissues, thereby mitigating any potential facial reconstruction or recognition risks. The preprocessed images were segmented using a deep network based on nnU-Net, resulting in four distinct labels that correspond to different components of each tumor. A spreadsheet is also provided that includes tumor volumes and signal intensity of different tumor components across various MR sequences.
Usage Notes
Each scan was manually exported using the built-in McKesson DICOM export tool into separate folders labeled as post-treatment 1, post-treatment 2, etc. In a subsequent step, a subset of the data was selected to contribute to the development of the FeTS 2 toolbox. Consequently, the naming convention was updated to replace "post-treatment" with "timepoint" (e.g., post-treatment 1 became timepoint 1) to adhere to the instructions of the FeTS development team. Each sequence was saved in its own folder within these categories on a HIPAA-compliant, secured server within the University of Missouri network. Export was conducted in DICOM format, maintaining the original image compression settings to preserve quality. To ensure patient privacy and HIPAA compliance, all images were anonymized and all protected health information (PHI), e.g., patient name, MRN, and accession number, was deleted from the DICOM metadata headers. The folders are labeled in the following structure:
Data Access
Version 1: Updated 2025/03/21
Title | Data Type | Format | Access Points | Subjects | License | Metadata | |||
---|---|---|---|---|---|---|---|---|---|
Images (skull-stripped) and Segmentations | Segmentation | NIFTI | Download requires IBM-Aspera-Connect plugin | 203 | 2,978 | CC BY 4.0 | View | ||
Segmentation Volumes | Measurement, Radiomic Feature | XLSX | 203 | CC BY 4.0 | — | ||||
Clinical Data | Demographic, Diagnosis, Molecular Test, Treatment, Follow-Up, Other | XLSX | 203 | CC BY 4.0 | — |
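The clinical data record a date of death or, when that is unknown, a date of last known contact while alive. As a minimal sketch of how such fields could be turned into a follow-up duration and an event indicator for survival analysis, the function below uses only the Python standard library. The parameter names are hypothetical and are not the column names of the Clinical Data spreadsheet; measuring time from the date of diagnosis is also an assumption (the date of first surgery would be an equally plausible origin).

```python
from datetime import date
from typing import Optional, Tuple


def survival_record(
    diagnosis: date,
    death: Optional[date] = None,
    last_contact: Optional[date] = None,
) -> Tuple[int, bool]:
    """Return (follow-up days, event observed) for one patient.

    All argument names are illustrative assumptions, not dataset columns.
    If a date of death is known, the event is observed; otherwise the
    patient is censored at the date of last known contact.
    """
    if death is not None:
        return (death - diagnosis).days, True
    if last_contact is not None:
        return (last_contact - diagnosis).days, False
    raise ValueError("need either a date of death or a date of last contact")


# Made-up dates, for illustration only.
observed = survival_record(date(2021, 1, 1), death=date(2021, 12, 31))
censored = survival_record(date(2021, 1, 1), last_contact=date(2022, 1, 1))
```

Patients documented as "lost to follow-up" would fall into the censored branch, with the last contact date taken from their primary treatment period.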
Citations & Data Usage Policy
Data Citation Required: Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution must include the following citation, including the Digital Object Identifier:
Data Citation
Yaseen, D., Garrett, F., Gass, J., Greaser, J., Isufi, E., Layfield, L. J., Nada, A., Porgorzelski, K., Sinclair, J., Tahon, N. H. M., & Thacker, J. (2025). University of Missouri Post-operative Glioma Dataset (MU-Glioma-Post) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/7K9K-3C83
Detailed Description
Table 1: Summary distribution of different Siemens scanners
Count | Percentage | |
Avanto | 42 | 7.25% |
Aera | 270 | 46.63% |
Symphony | 6 | 1.04% |
Skyra | 192 | 33.16% |
Vida | 69 | 11.92% |
Total | 579 | 100.00% |
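The percentages in Table 1 and the vendor split quoted in the Data Acquisition text can be recomputed from the counts alone. The sketch below does this with plain Python; the counts are copied from Table 1 and from the vendor figures in the text (GE n=12, Philips n=3), while the variable and function names are illustrative.

```python
# Study counts per Siemens scanner model, copied from Table 1.
siemens_counts = {"Avanto": 42, "Aera": 270, "Symphony": 6, "Skyra": 192, "Vida": 69}


def percentages(counts):
    """Return each count as a percentage of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


siemens_total = sum(siemens_counts.values())
siemens_pct = percentages(siemens_counts)

# Vendor counts: the Siemens total from Table 1, GE and Philips from the text.
vendor_counts = {"Siemens": siemens_total, "GE": 12, "Philips": 3}
vendor_pct = percentages(vendor_counts)

for name, pct in siemens_pct.items():
    print(f"{name}: {siemens_counts[name]} studies, {pct:.2f}%")
```

The same helper can be applied to any other count column in the tables to check that the reported shares add up.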
Table 2: Summary of MR parameters of the four basic MR sequences.
T1 | T1CE | T2 | T2FLAIR | |||||||||
Median | Mode | Range | Median | Mode | Range | Median | Mode | Range | Median | Mode | Range | |
Slice Thickness | 1 | 1 | 0.9 – 5.5 | 1 | 1 | 0.8 – 5.5 | 5 | 5 | 1 – 5.5 | 5 | 5 | 1 – 5 |
SAR | 0.092 | 0.085 | 0.006 – 0.86 | 0.117 | 0.426 | 0.01 – 2.61 | 0.503 | 0.237 | 0.003 – 2.99 | 0.162 | 0.077 | 0.009 – 1.151 |
ET (echo time) | 2.97 | 2.97 | 2.17 – 28.98 | 2.97 | 2.97 | 1.85 – 140.62 | 96 | 87 | 1.78 – 118 | 87 | 87 | 19.9 – 442 |
RT (repetition time) | 2000 | 2300 | 220 – 2300 | 1490 | 1900 | 6.22 – 9502 | 4500 | 5000 | 4.6 – 7096.779 | 8000 | 8000 | 750 – 11000 |
IT (inversion time) | 900 | 900 | 900 – 1100 | 900 | 900 | 0 – 2500 | | | | 2370 | 2370 | 1350 – 2800 |
Flip Angle | 9 | 8 | 8 – 150 | 15 | 90 | 7 – 160 | 150 | 150 | 20 – 180 | 150 | 150 | 20 – 180 |
Pixel Bandwidth | 199 | 150 | 90 – 630 | 185 | 160 | 100 – 630 | 215 | 190 | 100 – 600 | 290 | 190 | 97.656 – 1130 |
Table 3: Summary of 2D vs 3D of each MR sequence.
2D | 3D | |||
Count | Percentage | Count | Percentage | |
T1 | 219 | 36.87% | 375 | 63.13% |
T1CE | 229 | 38.55% | 365 | 61.45% |
T2 | 593 | 99.83% | 1 | 0.17% |
FLAIR | 378 | 63.64% | 216 | 36.36% |
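The per-sequence shares in Table 3, and the overall 2D vs 3D split across all four sequences, follow directly from the counts. The sketch below recomputes them from the Table 3 counts; the function name and the layout of the returned dictionary are illustrative choices.

```python
# (2D count, 3D count) per MR sequence, copied from Table 3.
table3_counts = {
    "T1": (219, 375),
    "T1CE": (229, 365),
    "T2": (593, 1),
    "FLAIR": (378, 216),
}


def acquisition_summary(counts):
    """Recompute per-sequence and overall 2D/3D percentages from raw counts."""
    per_sequence = {}
    for sequence, (n_2d, n_3d) in counts.items():
        n = n_2d + n_3d
        per_sequence[sequence] = (round(100 * n_2d / n, 2), round(100 * n_3d / n, 2))
    total_2d = sum(n_2d for n_2d, _ in counts.values())
    total_3d = sum(n_3d for _, n_3d in counts.values())
    grand_total = total_2d + total_3d
    return {
        "per_sequence": per_sequence,
        "total_2d": total_2d,
        "total_3d": total_3d,
        "pct_2d": round(100 * total_2d / grand_total, 2),
        "pct_3d": round(100 * total_3d / grand_total, 2),
    }


summary = acquisition_summary(table3_counts)
print(summary["total_2d"], summary["total_3d"], summary["pct_2d"], summary["pct_3d"])
```

Each sequence row sums to the same number of studies, so the overall split is simply the column totals divided by four times that number.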
Acknowledgements
We would like to acknowledge the BraTS team for assisting with the processing and automatic segmenting of our dataset.
Related Publications
Publications by the Dataset Authors
The authors recommended the following as the best source of additional information about this dataset:
Moawad AW, et al. The Brain Tumor Segmentation – Metastases (BraTS-METS) Challenge 2023: Brain Metastasis Segmentation on Pre-treatment MRI. ArXiv [Preprint]. 2024 Dec 9:arXiv:2306.00838v3. PMID: 37396600; PMCID: PMC10312806.
De Verdier, M. C., Saluja, R., Gagnon, L., LaBella, D., Baid, U., Tahon, N. H., Zhang, J., Alafif, M., Baig, S., Chang, K., Deptula, L., Gupta, D., Haider, M. A., Hussain, A., Iv, M., Kontzialis, M., Manning, P., Moodi, F., Nunes, T., . . . Rudie, J. D. (2024). The 2024 Brain Tumor Segmentation (BraTS) Challenge: Glioma Segmentation on Post-treatment MRI. ArXiv. https://arxiv.org/abs/2405.18368
Research Community Publications
TCIA maintains a list of publications that leveraged this dataset. If you have a manuscript you’d like to add please contact TCIA’s Helpdesk.