Beyond Breast Density: Risk Measures for Breast Cancer in Multiple Imaging Modalities
Acciavatti, Raymond J; Lee, Su Hyun; Reig, Beatriu; Moy, Linda; Conant, Emily F; Kontos, Despina; Moon, Woo Kyung
Breast density is an independent risk factor for breast cancer. In digital mammography and digital breast tomosynthesis, breast density is assessed visually using the four-category scale developed by the American College of Radiology Breast Imaging Reporting and Data System (5th edition as of November 2022). Epidemiologically based risk models, such as the Tyrer-Cuzick model (version 8), demonstrate superior modeling performance when mammographic density is incorporated. Beyond just density, a separate mammographic measure of breast cancer risk is parenchymal textural complexity. With advancements in radiomics and deep learning, mammographic textural patterns can be assessed quantitatively and incorporated into risk models. Other supplemental screening modalities, such as breast US and MRI, offer independent risk measures complementary to those derived from mammography. Breast US allows the two components of fibroglandular tissue (stromal and glandular) to be visualized separately in a manner that is not possible with mammography. A higher glandular component at screening breast US is associated with higher risk. With MRI, a higher background parenchymal enhancement of the fibroglandular tissue has also emerged as an imaging marker for risk assessment. Imaging markers observed at mammography, US, and MRI are powerful tools in refining breast cancer risk prediction, beyond mammographic density alone.
ChatGPT and Other Large Language Models Are Double-edged Swords [Editorial]
Shen, Yiqiu; Heacock, Laura; Elias, Jonathan; Hentel, Keith D; Reig, Beatriu; Shih, George; Moy, Linda
New Horizons: Artificial Intelligence for Digital Breast Tomosynthesis
Goldberg, Julia E; Reig, Beatriu; Lewin, Alana A; Gao, Yiming; Heacock, Laura; Heller, Samantha L; Moy, Linda
The use of digital breast tomosynthesis (DBT) in breast cancer screening has become widely accepted, facilitating increased cancer detection and lower recall rates compared with those achieved by using full-field digital mammography (DM). However, the use of DBT, as compared with DM, raises new challenges, including a larger number of acquired images and thus longer interpretation times. While most current artificial intelligence (AI) applications are developed for DM, there are multiple potential opportunities for AI to augment the benefits of DBT. During the diagnostic steps of lesion detection, characterization, and classification, AI algorithms may not only assist in the detection of indeterminate or suspicious findings but also aid in predicting the likelihood of malignancy for a particular lesion. During image acquisition and processing, AI algorithms may help reduce radiation dose and improve lesion conspicuity on synthetic two-dimensional DM images. The use of AI algorithms may also improve workflow efficiency and decrease the radiologist's interpretation time. There has been significant growth in research that applies AI to DBT, with several algorithms approved by the U.S. Food and Drug Administration for clinical implementation. Further development of AI models for DBT has the potential to lead to improved practice efficiency and ultimately improved patient health outcomes of breast cancer screening and diagnostic evaluation. See the invited commentary by Bahl in this issue. ©RSNA, 2022.
Improving breast cancer diagnostics with deep learning for MRI
Witowski, Jan; Heacock, Laura; Reig, Beatriu; Kang, Stella K; Lewin, Alana; Pysarenko, Kristine; Patel, Shalin; Samreen, Naziya; Rudnicki, Wojciech; Łuczyńska, Elżbieta; Popiela, Tadeusz; Moy, Linda; Geras, Krzysztof J
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has a high sensitivity in detecting breast cancer but often leads to unnecessary biopsies and patient workup. We used a deep learning (DL) system to improve the overall accuracy of breast cancer diagnosis and personalize management of patients undergoing DCE-MRI. On the internal test set (n = 3936 exams), our system achieved an area under the receiver operating characteristic curve (AUROC) of 0.92 (95% CI: 0.92 to 0.93). In a retrospective reader study, there was no statistically significant difference (P = 0.19) between five board-certified breast radiologists and the DL system (mean ΔAUROC, +0.04 in favor of the DL system). Radiologists' performance improved when their predictions were averaged with DL's predictions [mean ΔAUPRC (area under the precision-recall curve), +0.07]. We demonstrated the generalizability of the DL system using multiple datasets from Poland and the United States. An additional reader study on a Polish dataset showed that the DL system was as robust to distribution shift as radiologists. In subgroup analysis, we observed consistent results across different cancer subtypes and patient demographics. Using decision curve analysis, we showed that the DL system can reduce unnecessary biopsies in the range of clinically relevant risk thresholds. This would lead to avoiding biopsies yielding benign results in up to 20% of all patients with BI-RADS category 4 lesions. Last, we performed an error analysis, investigating situations where DL predictions were mostly incorrect. This exploratory work creates a foundation for deployment and prospective analysis of DL-based models for breast MRI.
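The decision curve analysis mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard net-benefit definition (true positives per patient minus false positives per patient weighted by the odds of the risk threshold); the function names and the toy labels and scores are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of biopsying every case with predicted risk >= threshold.

    Assumed standard definition: TP/N - FP/N * threshold / (1 - threshold).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the risk threshold
    return tp / n - weight * fp / n


def net_benefit_biopsy_all(y_true, threshold):
    """Reference strategy: biopsy every patient regardless of the model."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Made-up data for illustration only: 1 = malignant, 0 = benign.
toy_labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_risks = [0.9, 0.1, 0.3, 0.7, 0.05, 0.2, 0.6, 0.4, 0.15, 0.02]

for t in (0.2, 0.5):
    print(t, net_benefit(toy_labels, toy_risks, t),
          net_benefit_biopsy_all(toy_labels, t))
```

A model is useful at a given threshold when its net benefit exceeds that of the biopsy-all strategy; plotting both over a range of thresholds gives the decision curve.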
Differences between human and machine perception in medical diagnosis
Makino, Taro; Jastrzębski, Stanisław; Oleszkiewicz, Witold; Chacko, Celin; Ehrenpreis, Robin; Samreen, Naziya; Chhor, Chloe; Kim, Eric; Lee, Jiyon; Pysarenko, Kristine; Reig, Beatriu; Toth, Hildegard; Awal, Divya; Du, Linda; Kim, Alice; Park, James; Sodickson, Daniel K; Heacock, Laura; Moy, Linda; Cho, Kyunghyun; Geras, Krzysztof J
Deep neural networks (DNNs) show promise in image-based medical diagnosis, but cannot be fully trusted since they can fail for reasons unrelated to underlying pathology. Humans are less likely to make such superficial mistakes, since they use features that are grounded on medical science. It is therefore important to know whether DNNs use different features than humans. Towards this end, we propose a framework for comparing human and machine perception in medical diagnosis. We frame the comparison in terms of perturbation robustness, and mitigate Simpson's paradox by performing a subgroup analysis. The framework is demonstrated with a case study in breast cancer screening, where we separately analyze microcalcifications and soft tissue lesions. While it is inconclusive whether humans and DNNs use different features to detect microcalcifications, we find that for soft tissue lesions, DNNs rely on high frequency components ignored by radiologists. Moreover, these features are located outside of the region of the images found most suspicious by radiologists. This difference between humans and machines was only visible through subgroup analysis, which highlights the importance of incorporating medical domain knowledge into the comparison.
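As a rough illustration of the idea of comparing perturbation robustness within subgroups, the sketch below low-pass filters each image and reports the mean change in a predictor's output per subgroup. Everything here is an assumption made for illustration: the box blur stands in for whatever filter the authors used, `high_freq_score` is a hypothetical toy predictor rather than a DNN or a radiologist, and the two tiny images are made up.

```python
from collections import defaultdict


def box_blur(image, radius=1):
    """Low-pass filter a 2D image (list of rows); the window is truncated at borders."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total, count = 0.0, 0
            for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                    total += image[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out


def high_freq_score(image):
    """Hypothetical predictor: mean absolute difference of horizontal neighbours."""
    diffs = [abs(row[c + 1] - row[c]) for row in image for c in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


def mean_shift_by_subgroup(images, subgroups, predict, radius=1):
    """Mean absolute change in the prediction after blurring, computed per subgroup."""
    shifts = defaultdict(list)
    for img, group in zip(images, subgroups):
        shifts[group].append(abs(predict(box_blur(img, radius)) - predict(img)))
    return {group: sum(v) / len(v) for group, v in shifts.items()}


# Made-up 4x4 images: a checkerboard (fine texture) and a constant patch.
checker = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
flat = [[0.5] * 4 for _ in range(4)]
result = mean_shift_by_subgroup([checker, flat], ["textured", "flat"], high_freq_score)
print(result)
```

Reporting the shift per subgroup rather than pooled over all cases is what keeps an effect present in one subgroup from being masked by another, which is the concern the abstract raises with Simpson's paradox.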
Breast Inflammatory Change Is Transient Following COVID-19 Vaccination
Kim, Eric; Reig, Beatriu
Axillary Adenopathy after COVID-19 Vaccine: No Reason to Delay Screening Mammogram
Wolfson, Stacey; Kim, Eric; Plaunova, Anastasia; Bukhman, Rita; Sarmiento, Ruth D; Samreen, Naziya; Awal, Divya; Sheth, Monica M; Toth, Hildegard B; Moy, Linda; Reig, Beatriu
Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams
Shen, Yiqiu; Shamout, Farah E; Oliver, Jamie R; Witowski, Jan; Kannan, Kawshik; Park, Jungkyu; Wu, Nan; Huddleston, Connor; Wolfson, Stacey; Millet, Alexandra; Ehrenpreis, Robin; Awal, Divya; Tyma, Cathy; Samreen, Naziya; Gao, Yiming; Chhor, Chloe; Gandhi, Stacey; Lee, Cindy; Kumari-Subaiya, Sheila; Leonard, Cindy; Mohammed, Reyhan; Moczulski, Christopher; Altabet, Jaime; Babb, James; Lewin, Alana; Reig, Beatriu; Moy, Linda; Heacock, Laura; Geras, Krzysztof J
Though consistently shown to detect mammographically occult cancers, breast ultrasound has been noted to have high false-positive rates. In this work, we present an AI system that achieves radiologist-level accuracy in identifying breast cancer in ultrasound images. Developed on 288,767 exams, consisting of 5,442,907 B-mode and Color Doppler images, the AI achieves an area under the receiver operating characteristic curve (AUROC) of 0.976 on a test set consisting of 44,755 exams. In a retrospective reader study, the AI achieves a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924 ± 0.02 radiologists). With the help of the AI, radiologists decrease their false positive rates by 37.3% and reduce requested biopsies by 27.8%, while maintaining the same level of sensitivity. This highlights the potential of AI in improving the accuracy, consistency, and efficiency of breast ultrasound diagnosis.
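For readers unfamiliar with the metrics quoted in the abstract above, the following minimal sketch computes an AUROC as the probability that a randomly chosen malignant exam scores higher than a randomly chosen benign one (ties counted as one half), and a relative reduction in false-positive rate. The function names and toy values are hypothetical; the study's own evaluation code is not reproduced here.

```python
def auroc(y_true, scores):
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def false_positive_rate(y_true, y_pred):
    """Fraction of benign cases (label 0) that were called positive."""
    negatives = [p for y, p in zip(y_true, y_pred) if y == 0]
    if not negatives:
        raise ValueError("need at least one negative case")
    return sum(negatives) / len(negatives)


def relative_reduction(before, after):
    """Relative decrease from `before` to `after`, as a fraction of `before`."""
    return (before - after) / before


# Made-up labels, scores, and binary calls for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
calls_unaided = [1, 1, 1, 1]
calls_aided = [0, 1, 1, 1]

print(auroc(toy_labels, toy_scores))
print(relative_reduction(false_positive_rate(toy_labels, calls_unaided),
                         false_positive_rate(toy_labels, calls_aided)))
```

The pairwise form is quadratic in the number of cases, which is fine for a small illustration; a rank-based computation would be the usual choice for test sets of the size described in the abstract.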
Lessons from the first DBTex Challenge
Park, Jungkyu; Shoshan, Yoel; Marti, Robert; Gómez del Campo, Pablo; Ratner, Vadim; Khapun, Daniel; Zlotnick, Aviad; Barkan, Ella; Gilboa-Solomon, Flora; Chłędowski, Jakub; Witowski, Jan; Millet, Alexandra; Kim, Eric; Lewin, Alana; Pysarenko, Kristine; Chen, Sardius; Goldberg, Julia; Patel, Shalin; Plaunova, Anastasia; Wegener, Melanie; Wolfson, Stacey; Lee, Jiyon; Hava, Sana; Murthy, Sindhoora; Du, Linda; Gaddam, Sushma; Parikh, Ujas; Heacock, Laura; Moy, Linda; Reig, Beatriu; Rosen-Zvi, Michal; Geras, Krzysztof J.
Radiomics and deep learning methods in expanding the use of screening breast MRI [Editorial]
KEY POINTS/CONCLUSIONS:
• The use of screening breast MRI is expanding beyond high-risk women to include intermediate- and average-risk women.
• The study by Pötsch et al uses a radiomics-based method to decrease the number of benign biopsies while maintaining high sensitivity.
• Future studies will likely increasingly focus on deep learning methods and abbreviated MRI data.