New Horizons: Artificial Intelligence for Digital Breast Tomosynthesis
The use of digital breast tomosynthesis (DBT) in breast cancer screening has become widely accepted, facilitating increased cancer detection and lower recall rates compared with those achieved by using full-field digital mammography (DM). However, the use of DBT, as compared with DM, raises new challenges, including a larger number of acquired images and thus longer interpretation times. While most current artificial intelligence (AI) applications are developed for DM, there are multiple potential opportunities for AI to augment the benefits of DBT. During the diagnostic steps of lesion detection, characterization, and classification, AI algorithms may not only assist in the detection of indeterminate or suspicious findings but also aid in predicting the likelihood of malignancy for a particular lesion. During image acquisition and processing, AI algorithms may help reduce radiation dose and improve lesion conspicuity on synthetic two-dimensional DM images. The use of AI algorithms may also improve workflow efficiency and decrease the radiologist's interpretation time. There has been significant growth in research that applies AI to DBT, with several algorithms approved by the U.S. Food and Drug Administration for clinical implementation. Further development of AI models for DBT has the potential to lead to improved practice efficiency and ultimately improved patient health outcomes of breast cancer screening and diagnostic evaluation. See the invited commentary by Bahl in this issue. ©RSNA, 2022.
Improving breast cancer diagnostics with deep learning for MRI
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has a high sensitivity in detecting breast cancer but often leads to unnecessary biopsies and patient workup. We used a deep learning (DL) system to improve the overall accuracy of breast cancer diagnosis and personalize management of patients undergoing DCE-MRI. On the internal test set (n = 3936 exams), our system achieved an area under the receiver operating characteristic curve (AUROC) of 0.92 (95% CI: 0.92 to 0.93). In a retrospective reader study, there was no statistically significant difference (P = 0.19) between five board-certified breast radiologists and the DL system (mean ΔAUROC, +0.04 in favor of the DL system). Radiologists' performance improved when their predictions were averaged with DL's predictions [mean ΔAUPRC (area under the precision-recall curve), +0.07]. We demonstrated the generalizability of the DL system using multiple datasets from Poland and the United States. An additional reader study on a Polish dataset showed that the DL system was as robust to distribution shift as radiologists. In subgroup analysis, we observed consistent results across different cancer subtypes and patient demographics. Using decision curve analysis, we showed that the DL system can reduce unnecessary biopsies in the range of clinically relevant risk thresholds. This would lead to avoiding biopsies yielding benign results in up to 20% of all patients with BI-RADS category 4 lesions. Last, we performed an error analysis, investigating situations where DL predictions were mostly incorrect. This exploratory work creates a foundation for deployment and prospective analysis of DL-based models for breast MRI.
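The decision curve analysis mentioned in this abstract rests on the standard net-benefit quantity, which weighs true positives against false positives at a chosen risk threshold. The following is a minimal sketch of that textbook formula only; the function names and the toy labels and scores are hypothetical, and nothing here reproduces the study's data or its actual analysis pipeline.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting (e.g. biopsy) when predicted risk >= threshold.

    Standard decision-curve formula: TP/n - FP/n * pt / (1 - pt),
    where pt is the risk threshold (must be strictly between 0 and 1).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that acts on every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = malignant, 0 = benign; scores are made up.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.1]

model_nb = net_benefit(labels, scores, threshold=0.5)
treat_all_nb = net_benefit_treat_all(labels, threshold=0.5)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all strategy and zero (treat none); a decision curve plots this comparison across a range of thresholds.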
Response to Letter to JACR regarding recently released ACR Appropriateness Criteria Supplemental Breast Cancer Screening [Letter]
ACR Appropriateness Criteria® Imaging of the Axilla
This publication reviews the current evidence supporting the imaging approach to the axilla in various scenarios with a broad differential diagnosis ranging from inflammatory to malignant etiologies. Controversies on the management of axillary adenopathy result in disagreement on the appropriate axillary imaging tests. Ultrasound is often the appropriate initial imaging test in several clinical scenarios. Clinical information (such as age, physical examinations, risk factors) and concurrent complete breast evaluation with mammogram, tomosynthesis, or MRI impact the type of initial imaging test for the axilla. Several impactful clinical trials demonstrated that selected patient populations can receive sentinel lymph node biopsy instead of axillary lymph node dissection with similar overall survival, and axillary lymph node dissection is a safe alternative as the nodal staging procedure for clinically node negative patients or even for some node positive patients with limited nodal tumor burden. This approach is not universally accepted, which adversely affects the type of imaging tests considered appropriate for the axilla. This document is focused on the initial imaging of the axilla in various scenarios, with the understanding that concurrent or subsequent additional tests may also be performed for the breast. The American College of Radiology Appropriateness Criteria are evidence-based guidelines for specific clinical conditions that are reviewed annually by a multidisciplinary expert panel. The guideline development and revision include an extensive analysis of current medical literature from peer reviewed journals and the application of well-established methodologies (RAND/UCLA Appropriateness Method and Grading of Recommendations Assessment, Development, and Evaluation or GRADE) to rate the appropriateness of imaging and treatment procedures for specific clinical scenarios.
In those instances where evidence is lacking or equivocal, expert opinion may supplement the available evidence to recommend imaging or treatment.
Prospective multicenter assessment of patient preferences for properties of gadolinium-based contrast media and their potential socioeconomic impact in a screening breast MRI setting
OBJECTIVE: It is unknown how patients prioritize gadolinium-based contrast media (GBCM) benefits (detection sensitivity) and risks (reactions, gadolinium retention, cost). The purpose of this study is to measure preferences for properties of GBCM in women at intermediate or high risk of breast cancer undergoing annual screening MRI. METHODS: An institutional review board-approved prospective discrete choice conjoint survey was administered to patients at intermediate or high risk for breast cancer undergoing screening MRI at 4 institutions (July 2018-March 2020). Participants were given 15 tasks and asked to choose which of two hypothetical GBCM they would prefer. GBCMs varied by the following attributes: sensitivity for cancer detection (80-95%), intracranial gadolinium retention (1-100 molecules per 100 million administered), severe allergic-like reaction rate (1-19 per 100,000 administrations), mild allergic-like reaction rate (10-1000 per 100,000 administrations), and out-of-pocket cost ($25-$100). Attribute levels were based on published values of existing GBCMs. Hierarchical Bayesian analysis was used to derive attribute "importance." Preference shares were determined by simulation. RESULTS: Response (87% [247/284]) and completion (96% [236/247]) rates were excellent. Sensitivity (importance = 44.3%, 95% confidence interval = 42.0-46.7%) was valued more than GBCM-related risks (mild allergic-like reaction risk [19.5%, 17.9-21.1%], severe allergic-like reaction risk [17.0%, 15.8-18.1%], intracranial gadolinium retention [11.6%, 10.5-12.7%], out-of-pocket expense [7.5%, 6.8-8.3%]). Lower income participants placed more importance on cost and less on sensitivity (p < 0.01). A simulator is provided that models GBCM preference shares by GBCM attributes and competition. CONCLUSIONS: Patients at intermediate or high risk for breast cancer undergoing MRI screening prioritize cancer detection over GBCM-related risks, and prioritize reaction risks over gadolinium retention.
KEY POINTS: • Among women undergoing annual breast MRI screening, cancer detection sensitivity (attribute "importance," 44.3%) was valued more than GBCM-related risks (mild allergic reaction risk 19.5%, severe allergic reaction risk 17.0%, intracranial gadolinium retention 11.6%, out-of-pocket expense 7.5%). • Prospective four-center patient preference data have been incorporated into a GBCM choice simulator that allows users to input GBCM properties and calculate patient preference shares for competitor GBCMs. • Lower-income women placed more importance on out-of-pocket cost and less importance on cancer detection (p < 0.01) when prioritizing GBCM properties.
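For readers unfamiliar with conjoint analysis, attribute "importance" is conventionally the range of an attribute's part-worth utilities divided by the sum of ranges over all attributes, and preference shares are often simulated with a logit (share-of-preference) rule. The sketch below illustrates those two conventional calculations under stated assumptions: the abstract does not specify the simulation rule the authors used, the function names are hypothetical, and the part-worth values are invented for illustration rather than taken from the study.

```python
import math


def attribute_importance(part_worths):
    """Percent importance per attribute: utility range / sum of all ranges."""
    ranges = {
        attribute: max(levels.values()) - min(levels.values())
        for attribute, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {attribute: 100.0 * r / total for attribute, r in ranges.items()}


def preference_shares(total_utilities):
    """Logit share of preference for competing products (assumed rule)."""
    largest = max(total_utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - largest) for u in total_utilities]
    denom = sum(exps)
    return [e / denom for e in exps]


# Invented part-worth utilities for two attributes (not study data).
example_part_worths = {
    "sensitivity": {"80%": -1.5, "95%": 1.5},
    "cost": {"$25": 0.5, "$100": -0.5},
}
importance = attribute_importance(example_part_worths)

# Two hypothetical products with equal total utility split preference evenly.
shares = preference_shares([0.0, 0.0])
```

In a hierarchical Bayesian analysis these quantities would typically be computed per respondent from individual-level part-worths and then averaged, which this sketch does not attempt.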
ACR Appropriateness Criteria® Supplemental Breast Cancer Screening Based on Breast Density
Mammography remains the only validated screening tool for breast cancer; however, there are limitations to mammography. One of the limitations of mammography is the variable sensitivity based on breast density. Supplemental screening may be considered based on the patient's risk level and breast density. For average-risk women with nondense breasts, the sensitivity of digital breast tomosynthesis (DBT) screening is high; additional supplemental screening is not warranted in this population. For average-risk women with dense breasts, given the decreased sensitivity of mammography/DBT, this population may benefit from additional supplemental screening with contrast-enhanced mammography, screening ultrasound (US), breast MRI, or abbreviated breast MRI. In intermediate-risk women, there is emerging evidence suggesting that women in this population may benefit from breast MRI or abbreviated breast MRI. In intermediate-risk women with dense breasts, given the decreased sensitivity of mammography/DBT, this population may benefit from additional supplemental screening with contrast-enhanced mammography or screening US. There is strong evidence supporting screening high-risk women with breast MRI regardless of breast density. Contrast-enhanced mammography, whole breast screening US, or abbreviated breast MRI may also be considered. The American College of Radiology Appropriateness Criteria are evidence-based guidelines for specific clinical conditions that are reviewed annually by a multidisciplinary expert panel. The guideline development and revision include an extensive analysis of current medical literature from peer reviewed journals and the application of well-established methodologies (RAND/UCLA Appropriateness Method and Grading of Recommendations Assessment, Development, and Evaluation or GRADE) to rate the appropriateness of imaging and treatment procedures for specific clinical scenarios.
In those instances where evidence is lacking or equivocal, expert opinion may supplement the available evidence to recommend imaging or treatment.
Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams
Though consistently shown to detect mammographically occult cancers, breast ultrasound has been noted to have high false-positive rates. In this work, we present an AI system that achieves radiologist-level accuracy in identifying breast cancer in ultrasound images. Developed on 288,767 exams, consisting of 5,442,907 B-mode and Color Doppler images, the AI achieves an area under the receiver operating characteristic curve (AUROC) of 0.976 on a test set consisting of 44,755 exams. In a retrospective reader study, the AI achieves a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924 ± 0.02 radiologists). With the help of the AI, radiologists decrease their false positive rates by 37.3% and reduce requested biopsies by 27.8%, while maintaining the same level of sensitivity. This highlights the potential of AI in improving the accuracy, consistency, and efficiency of breast ultrasound diagnosis.
Lessons from the first DBTex Challenge
Bilateral gradient-echo spectroscopic imaging with correction of frequency variations for measurement of fatty acid composition in mammary adipose tissue
PURPOSE: To develop a simultaneous dual-slab three-dimensional gradient-echo spectroscopic imaging (GSI) technique with frequency drift compensation for rapid (<6 min) bilateral measurement of fatty acid composition (FAC) in mammary adipose tissue. METHODS: A bilateral GSI sequence was developed using a simultaneous dual-slab excitation followed by 128 monopolar echoes. A short train of navigator echoes without phase or partition encoding was included at the beginning of each pulse repetition time period to correct for frequency variation caused by respiration and heating of the cryostat. Voxel-wise spectral fitting was applied to measure the areas of the lipid spectral peaks to estimate the number of double bonds (ndb), number of methylene-interrupted double bonds (nmidb), and chain length (cl). The proposed method was tested in an oil phantom and 10 postmenopausal women to assess the influence of the frequency variation on FAC estimation. RESULTS: The frequency drift observed over 5:27 min during the phantom scan was about 10 Hz. Phase correction based on the navigator reduced the median error of ndb, nmidb, and cl from 9.7%, 17.6%, and 3.2% to 2.1%, 9.5%, and 2.8%, respectively. The in vivo data showed a mean ± standard deviation frequency drift of 17.4 ± 2.5 Hz, with ripples at 0.3 ± 0.1 Hz. Our reconstruction algorithm successfully separated signals from the left and right breasts with negligible residual aliasing. Phase correction reduced the interquartile range within each subject's adipose tissue of ndb, nmidb, and cl by 18.4 ± 10.6%, 18.5 ± 13.9%, and 18.4 ± 10.6%, respectively. CONCLUSION: This study shows the feasibility of obtaining bilateral spectroscopic imaging data in the breast and that incorporation of a frequency navigator improves the estimation of FAC.
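The navigator-based correction described in this abstract can be illustrated with the basic relationship between a frequency offset and accrued phase: an offset of Δf Hz adds a phase of 2πΔf·t at time t. The sketch below is only a generic illustration of that relationship, not the authors' reconstruction; it assumes two navigator echoes separated by a known spacing, and the function names and all numeric values are hypothetical.

```python
import cmath
import math


def estimate_offset_hz(nav_first, nav_second, echo_spacing_s):
    """Estimate a frequency offset from the phase step between two navigator echoes.

    Unambiguous only while |offset| < 1 / (2 * echo_spacing_s).
    """
    phase_step = cmath.phase(nav_second * nav_first.conjugate())
    return phase_step / (2.0 * math.pi * echo_spacing_s)


def correct_echoes(echoes, echo_times_s, offset_hz):
    """Remove the phase accrued from the offset at each echo time."""
    return [
        s * cmath.exp(-2j * math.pi * offset_hz * t)
        for s, t in zip(echoes, echo_times_s)
    ]


# Simulated example: a 10 Hz drift and 1 ms navigator spacing (made-up values).
true_offset_hz = 10.0
spacing_s = 0.001
nav_a = cmath.exp(1j * 0.3)
nav_b = nav_a * cmath.exp(2j * math.pi * true_offset_hz * spacing_s)
estimated_hz = estimate_offset_hz(nav_a, nav_b, spacing_s)

echo_times = [0.002, 0.004, 0.006]
drifted = [cmath.exp(2j * math.pi * true_offset_hz * t) for t in echo_times]
corrected = correct_echoes(drifted, echo_times, estimated_hz)
```

After correction, the simulated echoes return to unit phase, which is the property that keeps the lipid spectral peaks aligned across repetitions before voxel-wise fitting.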
Breast MRI for Evaluation of Response to Neoadjuvant Therapy
Neoadjuvant therapy is increasingly being used to treat early-stage triple-negative and human epidermal growth factor 2-overexpressing breast cancers, as well as locally advanced and inflammatory breast cancers. The rationales for neoadjuvant therapy are to shrink tumor size and potentially decrease the extent of surgery, to serve as an in vivo test of response to therapy, and to reveal prognostic information for the patient. MRI is the most accurate modality to demonstrate response to therapy and to help ensure accurate presurgical planning. Changes in lesion diameter, volume, and enhancement are used to predict complete response, partial response, or nonresponse to therapy. However, residual disease may be overestimated or underestimated at MRI. Fibrosis, necrotic tumors, and residual benign masses may be causes of overestimation of residual disease. Nonmass lesions, invasive lobular carcinoma, hormone receptor-positive tumors, nonconcentric shrinkage patterns, the use of antiangiogenic therapy, and late-enhancing foci may be causes of underestimation of residual disease. In patients with known axillary lymph node metastasis, neoadjuvant therapy may be followed by targeted axillary dissection to avoid the potential morbidity associated with an axillary lymph node dissection. Diffusion-weighted imaging, radiomics, machine learning, and deep learning methods are under investigation to improve MRI accuracy in predicting treatment response. ©RSNA, 2021.