[IJHCS] UTA7: a dataset of rates (BI-RADS) provided by clinicians when classifying medical images for breast cancer diagnosis.
Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the research community to focus on hard problems. In this repository, we present a limited sampling of the rates achieved during the User Tests and Analysis 7 (UTA7) study. The rates represent the numbers of False-Positives and False-Negatives recorded during breast cancer diagnosis. During the study, several medical imaging modalities were acquired and classified by several clinicians. The data included in this repository is a narrow range of values extracted during the medical imaging diagnosis of several patients; specifically, these values are part of the BI-RADS results provided by clinicians. Moreover, the representative DICOM files for those BI-RADS results are available in the dataset-uta7-dicom repository. Paired with this dataset-uta7-rates repository and the aforementioned dataset-uta7-dicom repository, the dataset-uta7-heatmaps repository is used to compute a visual representation of the dataset-uta7-annotations repository.
In this dataset-uta7-rates repository, we provide a dataset measuring the number of false-positives and false-negatives during our user tests. The user tests took place in clinical institutions, where clinicians diagnosed several patients both without and with an AI-Assistant agent. Here, we also provide, under the dataset/ directory, the extracted BI-RADS values of the UTA7 tasks. In our study, a survey template [DOI: 10.13140/RG.2.2.36306.86725] was used to collect the BI-RADS answers from clinicians. The work and results are published in a top Human-Computer Interaction (HCI) journal, the International Journal of Human-Computer Studies (IJHCS). Results were analyzed and interpreted in our Statistical Analysis charts. The user tests were conducted in clinical institutions, where clinicians diagnosed several patients for a Multi-Modality vs. Assistant comparison. For this comparison, the tests used the prototype-multi-modality, prototype-multi-modality-assistant, and prototype-heatmap repositories. Likewise, this dataset represents information from both the BreastScreening and MIDA research works. These research projects apply a recently proposed technique from the literature: Deep Convolutional Neural Networks (CNNs). From a developed User Interface (UI) and framework, these deep networks incorporate several datasets in different modes. For more information about the available datasets, please follow the Datasets page on the Wiki of the meta information repository. Last but not least, you can find further information on the Wiki of this repository. We also have several demos on our YouTube Channel, so please follow us.
We kindly ask scientific works and studies that make use of this repository to cite it in their associated publications. Similarly, we ask open-source and closed-source works that make use of the repository to inform us of that use.
You can cite our work using the following BibTeX entry:
@article{CALISTO2021102607,
title = {Introduction of Human-Centric AI Assistant to Aid Radiologists for Multimodal Breast Image Classification},
journal = {International Journal of Human-Computer Studies},
pages = {102607},
year = {2021},
issn = {1071-5819},
doi = {https://doi.org/10.1016/j.ijhcs.2021.102607},
url = {https://www.sciencedirect.com/science/article/pii/S1071581921000252},
author = {Francisco Maria Calisto and Carlos Santiago and Nuno Nunes and Jacinto C. Nascimento},
keywords = {Human-Computer Interaction, Artificial Intelligence, Healthcare, Medical Imaging, Breast Cancer},
abstract = {In this research, we take an HCI perspective on the opportunities provided by AI techniques in medical imaging, focusing on workflow efficiency and quality, preventing errors and variability of diagnosis in Breast Cancer. Starting from a holistic understanding of the clinical context, we developed BreastScreening to support Multimodality and integrate AI techniques (using a deep neural network to support automatic and reliable classification) in the medical diagnosis workflow. This was assessed by using a significant number of clinical settings and radiologists. Here we present: i) user study findings of 45 physicians comprising nine clinical institutions; ii) list of design recommendations for visualization to support breast screening radiomics; iii) evaluation results of a proof-of-concept BreastScreening prototype for two conditions Current (without AI assistant) and AI-Assisted; and iv) evidence from the impact of a Multimodality and AI-Assisted strategy in diagnosing and severity classification of lesions. The above strategies will allow us to conclude about the behaviour of clinicians when an AI module is present in a diagnostic system. This behaviour will have a direct impact in the clinicians workflow that is thoroughly addressed herein. Our results show a high level of acceptance of AI techniques from radiologists and point to a significant reduction of cognitive workload and improvement in diagnosis execution.}
}
The following list shows the dependencies required for this project to run locally:
Here are some tutorials and documentation, if needed, to help you feel more comfortable using and exploring this repository:
Usage: follow the instructions here to set up the current repository and extract the data. To understand what this repository is used for, read the following steps.
At this point, the only way to install this repository is manually. Eventually, it will be accessible through git, as mentioned on the roadmap.
Nonetheless, this kind of installation is as simple as cloning the repository. Virtually any Git or GitHub version control tool can do this. From the console, we can use the command below, but other methods are also fine.
git clone https://github.com/MIMBCD-UI/dataset-uta7-rates.git
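After cloning, the rates can be read programmatically. The snippet below sketches a small loader for CSV files under the dataset/ directory mentioned above; the file name and column layout in the demonstration are assumptions for illustration, so adapt them to the files actually present in the clone.

```python
import csv
import tempfile
from pathlib import Path

def load_rates(dataset_dir):
    """Read every CSV file under `dataset_dir` into a list of dict rows."""
    rows = []
    for csv_path in sorted(Path(dataset_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows

# Demonstration with a throwaway directory standing in for the cloned
# dataset-uta7-rates/dataset/ folder (the real layout may differ).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "rates_sample.csv"
    sample.write_text("clinician_id,patient_id,birads\nC01,P001,4\nC01,P002,2\n")
    rows = load_rates(tmp)
    print(len(rows), rows[0]["birads"])  # 2 rows loaded; first rating is "4"
```

Pointing `load_rates` at the cloned repository's dataset/ directory instead of the temporary one will return the real rows.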
Our goal is to follow this repository's purpose by addressing the information it provides. It is therefore of chief importance to scale the solution supported by this repository. The repository follows best practices, meeting the Core Infrastructure Initiative (CII) specifications.
Besides that, one of our goals involves creating a configuration file to automatically test and publish our code. It will most likely be prepared for GitHub Actions. Other goals may be written here in the future.
This project exists thanks to all the people who contribute. We welcome everyone who wants to help us improve this repository. Below, we present some suggestions.
Whether something seems missing or you need support, just open a new issue. Regardless of whether it is a simple request or a fully-structured feature, we will do our best to understand it and, eventually, address it.
We like to develop, but we also like collaboration. You could ask us to add some features or more data… Or you could do it yourself by forking this repository, or even start a side-project of your own. In the latter cases, please let us share some insights about what we currently have.
This section summarizes the important items of this repository and the fundamental resources that were crucial to it.
The following list represents the set of repositories related to this one:
To publish our datasets, we used the well-known Kaggle platform. To access our project's Profile Page, just follow the link. Last but not least, you can also follow our work on the data.world, figshare.com, and openml.org platforms.
Copyright © 2021 Instituto Superior Técnico
The dataset-uta7-rates repository is distributed under the terms of the GNU AGPLv3 license and the CC-BY-SA-4.0 license. Permissions of these licenses are conditioned on making available the complete elements of licensed works and modifications from this repository, which include larger works using a licensed work, under the same license. Copyright and license notices must be preserved.
Our team brings everything together, sharing ideas and the same purpose, to develop even better work. In this section, we list the important people behind this repository, along with their respective links.
Francisco Maria Calisto [ Website | ResearchGate | GitHub | LinkedIn ]
Carlos Santiago [ ResearchGate ]
Nuno Nunes [ ResearchGate ]
Hugo Lencastre [ ResearchGate ]
Nádia Mourão [ ResearchGate ]
This work was partially supported by national funds through FCT and IST through the UID/EEA/50009/2013 project and the BL89/2017-IST-ID grant. We thank Dr. Clara Aleluia and her radiology team at HFF for valuable insights and for using the Assistant on a daily basis. From IPO-Lisboa, we would like to thank the medical imaging teams of Dr. José Carlos Marques and Dr. José Venâncio. From IPO-Coimbra, we would like to thank the radiology department director and all the team of Dr. Idílio Gomes. Also, we would like to express our acknowledgments to Dr. Emília Vieira and Dr. Cátia Pedro from Hospital Santa Maria. Furthermore, we want to thank the whole radiology department team at HB for their participation. Last but not least, a great thanks to Dr. Cristina Ribeiro da Fonseca, who, among others, is giving us crucial information for the BreastScreening project.
Our organization is a non-profit organization. However, we have many needs across our activities. From infrastructure to services, we need time, contributions, and help to support our team and projects.
This project exists thanks to all the people who contribute. [Contribute].
Thank you to all our backers! 🙏 [Become a backer]
Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]