Breast-Lesions-USG | A Curated Benchmark Dataset for Ultrasound Based Breast Lesion Analysis
DOI: 10.7937/9WKK-Q141 | Data Citation Required | 925 Views | 1 Citation | Image Collection
Location | Species | Subjects | Data Types | Cancer Types | Size | Status | Updated | |
---|---|---|---|---|---|---|---|---|
Breast | Human | 256 | US, Segmentation, Radiomic Feature, Classification, Diagnosis, Follow-Up | Breast Cancer | Software/Source Code | Public, Complete | 2024/01/08 |
Summary
This dataset consists of 256 breast ultrasound scans collected from 256 patients, with 266 segmented benign and malignant lesions. It includes patient-level labels, image-level annotations, and tumor-level labels, with all cases confirmed by follow-up care or biopsy result. Each scan was manually annotated and labeled by a radiologist experienced in breast ultrasound examination. In particular, each tumor was identified in the image via a freehand annotation and labeled according to BI-RADS features. The tumor histopathological classification is stated for patients who underwent a biopsy.
Patient-level labels include clinical data such as age, breast tissue composition, signs and symptoms. Image-level freehand annotations identify the tumor and other abnormal areas in the image. The tumor and image are labeled with a BI-RADS category, 7 BI-RADS descriptors, and an interpretation of critical findings as the presence of breast diseases. Additional labels include the method of verification, tumor classification and histopathological diagnosis.
Because machine learning and theoretical computing play an established role in the development of augmented inference for cancer detection, the quality of the data used to develop explainable augmented inference methods is extremely important. This dataset can be used as an external testing set for assessing a model’s performance and for developing explainable AI or supervised machine learning models for the detection, segmentation and classification of breast abnormalities in ultrasound images.
A detailed description of this dataset is available at https://doi.org/10.1038/s41597-024-02984-z and should be cited along with the data citation.
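When the collection is used as an external testing set for segmentation, one common way to score a model is the Dice coefficient between a predicted mask and the radiologist's freehand annotation. The sketch below is illustrative only: the page does not prescribe a metric, the function name `dice_coefficient` is hypothetical, and the masks are assumed to be already decoded into equal-sized 2D lists of 0/1 values (in practice they would usually be array types loaded from the PNG files).

```python
def dice_coefficient(pred, ref):
    """Dice overlap between two binary masks given as equal-sized 2D lists of 0/1.

    Assumption: both masks are already binarized; nonzero values count as lesion.
    """
    if len(pred) != len(ref) or any(len(p) != len(r) for p, r in zip(pred, ref)):
        raise ValueError("masks must have the same shape")

    intersection = 0
    pred_area = 0
    ref_area = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            p_on, r_on = bool(p), bool(r)
            pred_area += p_on
            ref_area += r_on
            intersection += p_on and r_on

    if pred_area + ref_area == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / (pred_area + ref_area)


# Toy 3x3 example: reference lesion covers 4 pixels, prediction covers 3 of them.
reference = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
prediction = [[0, 1, 1],
              [0, 1, 0],
              [0, 0, 0]]
print(round(dice_coefficient(prediction, reference), 3))  # 0.857
```

Returning 1.0 for two empty masks is a design choice for images without a lesion; other conventions (for example, skipping such cases) are equally defensible and should be reported alongside the results.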
Data Access
Version 1: Updated 2024/01/08
Title | Data Type | Format | Access Points | Subjects | License | |||
---|---|---|---|---|---|---|---|---|
Images and segmentations | US, Segmentation | PNG and ZIP | 256 | 256 | 522 | CC BY 4.0 | ||
Clinical data | Radiomic Feature, Classification, Diagnosis, Follow-Up | XLSX | 256 | CC BY 4.0 |
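After downloading and unpacking the PNG archive, a first step is usually to pair each scan with its segmentation masks. The sketch below groups file names by case. The naming scheme it relies on (`case001.png` for the scan, `case001_tumor.png` and `case001_other1.png` for masks) is an assumption that this page does not state, and the names `CASE_PATTERN`, `group_by_case`, and `summarize` are hypothetical helpers, so check the pattern against the actual files before use. As a sanity check, 256 scans plus 266 masks would give 522 files, which matches the figure in the table above, although the page does not spell out that breakdown.

```python
import re
from collections import defaultdict

# Assumed (not documented on this page) naming scheme:
#   case001.png         -> ultrasound scan
#   case001_tumor.png   -> tumor mask
#   case001_other1.png  -> mask of another abnormal area
CASE_PATTERN = re.compile(r"^(case\d+)(?:_(tumor|other\d*))?\.png$", re.IGNORECASE)


def group_by_case(filenames):
    """Map each case ID to its scan file name and a list of mask file names."""
    cases = defaultdict(lambda: {"image": None, "masks": []})
    for name in sorted(filenames):
        match = CASE_PATTERN.match(name)
        if match is None:
            continue  # ignore files that do not follow the assumed scheme
        case_id = match.group(1).lower()
        if match.group(2) is None:
            cases[case_id]["image"] = name
        else:
            cases[case_id]["masks"].append(name)
    return dict(cases)


def summarize(cases):
    """Return (number of scans, number of masks) across all cases."""
    n_images = sum(1 for entry in cases.values() if entry["image"] is not None)
    n_masks = sum(len(entry["masks"]) for entry in cases.values())
    return n_images, n_masks


# Toy listing; in practice this would come from os.listdir() on the unpacked folder.
files = [
    "case001.png", "case001_tumor.png",
    "case002.png", "case002_tumor.png", "case002_other1.png",
    "notes.txt",
]
cases = group_by_case(files)
print(summarize(cases))  # (2, 3)
```

Running `summarize` on the full download and comparing the result with the subject and lesion counts reported in the summary is a quick way to confirm that the archive unpacked completely.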
Additional Resources for this Dataset
The following external resources have been made available by the data submitters. These are not hosted or supported by TCIA, but may be useful to researchers utilizing this collection.
- Files for simple import of the data into MATLAB and Python variables are available at https://github.com/best-ippt-pan-pl/BrEaST
Citations & Data Usage Policy
Data Citation Required: Users must abide by the TCIA Data Usage Policy and Restrictions. Attribution must include the following citation, including the Digital Object Identifier:
Data Citation |
|
Pawłowska, A., Ćwierz-Pieńkowska, A., Domalik, A., Jaguś, D., Kasprzak, P., Matkowski, R., Fura, Ł., Nowicki, A., & Zolek, N. (2024). A Curated Benchmark Dataset for Ultrasound Based Breast Lesion Analysis (Breast-Lesions-USG) (Version 1) [dataset]. The Cancer Imaging Archive. https://doi.org/10.7937/9WKK-Q141 |
Acknowledgements
We would like to acknowledge the individuals and institutions that have provided data for this collection:
The preparation of the dataset was supported by the National Centre for Research and Development, project INFOSTRATEG-I/0042/2021.
Related Publications
Publications by the Dataset Authors
The authors recommended the following as the best source of additional information about this dataset:
Publication Citation |
|
Pawłowska, A., Ćwierz-Pieńkowska, A., Domalik, A., Jaguś, D., Kasprzak, P., Matkowski, R., Fura, Ł., Nowicki, A., & Zolek, N. A Curated benchmark dataset for ultrasound based breast lesion analysis. Sci Data 11, 148 (2024). https://doi.org/10.1038/s41597-024-02984-z |
No other publications by dataset authors were recommended.
Research Community Publications
TCIA maintains a list of publications that leveraged this dataset. If you have a manuscript you’d like to add, please contact TCIA’s Helpdesk.