Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams
Though consistently shown to detect mammographically occult cancers, breast ultrasound has been noted to have high false-positive rates. In this work, we present an AI system that achieves radiologist-level accuracy in identifying breast cancer in ultrasound images. Developed on 288,767 exams, consisting of 5,442,907 B-mode and Color Doppler images, the AI achieves an area under the receiver operating characteristic curve (AUROC) of 0.976 on a test set consisting of 44,755 exams. In a retrospective reader study, the AI achieves a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924 ± 0.02 radiologists). With the help of the AI, radiologists decrease their false-positive rates by 37.3% and reduce requested biopsies by 27.8%, while maintaining the same level of sensitivity. This highlights the potential of AI in improving the accuracy, consistency, and efficiency of breast ultrasound diagnosis.
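For readers less familiar with the AUROC figures quoted above: AUROC can be read as the probability that a randomly chosen malignant exam receives a higher score than a randomly chosen benign one, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The labels and scores are hypothetical toy values, not data from the study, and the function name `auroc` is an illustrative choice.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for malignant, 0 for benign (hypothetical encoding).
    scores: model outputs, higher meaning more suspicious.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # malignant exam ranked above benign exam
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two malignant and three benign exams.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.2, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

The quadratic loop is chosen for readability; for test sets of the size reported in the abstract a rank-based implementation would be the practical choice.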
Preventing Physician Burnout in Breast Imaging: Scope of the Problem and Keys to Success
Physicians, including radiologists and specifically breast imagers, face many challenges and stressors during their daily routine, many of which can contribute to burnout. While there is an increasing body of literature evaluating burnout, including its prevalence in and impact on radiologists, there is a relative lack of information specifically addressing this topic as it relates to breast imaging. This article reviews key concepts in burnout, describes the potential impact on physicians at all levels of training and work, highlights aspects unique to the specialty of breast imaging that may contribute to burnout, and suggests tools and strategies that may help to combat and prevent burnout among breast imagers.
Radiologist Characteristics Associated with Interpretive Performance of Screening Mammography: A National Mammography Database (NMD) Study
Background Factors affecting radiologists' performance in screening mammography interpretation remain poorly understood. Purpose To identify radiologist characteristics that affect screening mammography interpretation performance. Materials and Methods This retrospective study included 1223 radiologists in the National Mammography Database (NMD) from 2008 to 2019 who could be linked to Centers for Medicare & Medicaid Services (CMS) datasets. NMD screening performance metrics were extracted. Acceptable ranges were defined as follows: recall rate (RR) between 5% and 12%; cancer detection rate (CDR) of at least 2.5 per 1000 screening examinations; positive predictive value of recall (PPV1) between 3% and 8%; positive predictive value of biopsies recommended (PPV2) between 20% and 40%; positive predictive value of biopsies performed (PPV3) between the 25th and 75th percentile of study sample; invasive CDR of at least the 25th percentile of the study sample; and percentage of ductal carcinoma in situ (DCIS) of at least the 25th percentile of the study sample. Radiologist characteristics extracted from CMS datasets included demographics, subspecialization, and clinical practice patterns. Multivariable stepwise logistic regression models were performed to identify characteristics independently associated with acceptable performance for the seven metrics. The most influential characteristics were defined as those independently associated with the majority of the metrics (at least four). Results Relative to radiologists practicing in the Northeast, those in the Midwest were more likely to achieve acceptable RR, PPV1, PPV2, and CDR (odds ratio [OR], 1.4-2.5); those practicing in the West were more likely to achieve acceptable RR, PPV2, and PPV3 (OR, 1.7-2.1) but less likely to achieve acceptable invasive CDR (OR, 0.6). Relative to general radiologists, breast imagers were more likely to achieve acceptable PPV1, invasive CDR, percentage DCIS, and CDR (OR, 1.4-4.4).
Those performing diagnostic mammography were more likely to achieve acceptable PPV1, PPV2, PPV3, invasive CDR, and CDR (OR, 1.9-2.9). Those performing breast US were less likely to achieve acceptable PPV1, PPV2, percentage DCIS, and CDR (OR, 0.5-0.7). Conclusion The geographic location of the radiology practice, subspecialization in breast imaging, and performance of diagnostic mammography are associated with better screening mammography performance; performance of breast US is associated with lower performance. © RSNA, 2021 Online supplemental material is available for this article.
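The four fixed acceptable ranges defined in the Materials and Methods above (RR, CDR, PPV1, PPV2) can be expressed as a simple check, sketched below. The class and function names are hypothetical, the example values are invented for illustration, and treating the range endpoints as inclusive is an assumption. The three percentile-based metrics (PPV3, invasive CDR, percentage DCIS) are omitted because their cutoffs depend on the study sample, which the abstract does not report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningMetrics:
    """Per-radiologist screening metrics (hypothetical container)."""
    recall_rate_pct: float   # RR, percent of screening exams recalled
    cdr_per_1000: float      # cancers detected per 1000 screening exams
    ppv1_pct: float          # PPV of recall, percent
    ppv2_pct: float          # PPV of biopsies recommended, percent


def acceptable_flags(m: ScreeningMetrics) -> dict:
    """Flag each metric against the fixed ranges quoted in the abstract.

    Endpoints are treated as inclusive, which is an assumption.
    """
    return {
        "RR": 5.0 <= m.recall_rate_pct <= 12.0,
        "CDR": m.cdr_per_1000 >= 2.5,
        "PPV1": 3.0 <= m.ppv1_pct <= 8.0,
        "PPV2": 20.0 <= m.ppv2_pct <= 40.0,
    }


# Invented example values, not taken from the study.
example = ScreeningMetrics(recall_rate_pct=14.0, cdr_per_1000=4.1,
                           ppv1_pct=4.5, ppv2_pct=25.0)
for metric, ok in acceptable_flags(example).items():
    print(f"{metric}: {'acceptable' if ok else 'outside range'}")
```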
Cancer Yield Exceeds 2% for BI-RADS 3 Probably Benign Findings in Women Older Than 60 Years in the National Mammography Database
Background Breast Imaging Reporting and Data System (BI-RADS) category 3 (BR3) (probably benign) mammographic assessments are reserved for imaging findings known to have likelihood of malignancy of 2% or less. Purpose To determine the effect of age, finding type, and prior mammography on cancer yield for BR3 findings in the National Mammography Database (NMD). Materials and Methods This HIPAA-compliant retrospective cohort institutional review board-exempt study evaluated women recalled from screening mammography followed by BR3 assessment at diagnostic evaluation from January 2009 to March 2018 and from 471 NMD facilities. Only the first BR3 occurrence was included for women with biopsy or imaging follow-up of at least 2 years. Women with a history of breast cancer or who underwent biopsy at time of initial BR3 assessment were excluded. Women were stratified by age in 10-year intervals. Cancer yield was calculated for each age group, with (for presumed new findings) and without prior mammographic comparison, and by lesion type, where available. Linear regression with weighted-age binning was performed to assess for differences between groups; P < .05 was indicative of a significant difference. Results A total of 1 380 652 (18.2%) women were recalled after screening mammography, of whom 157 130 (11.4%) were given a BR3 assessment within 90 days after screening. Of these, 43 628 women (median age, 55 years; age range, 25-90 years) had adequate follow-up for analysis. Cancer yield increased with increasing age decile, ranging from 0.51% (six of 1167) in women aged 30-39 years to 4.63% (41 of 885) in women aged 80-90 years; cancer yield exceeded 2% at and after age 59.7 years for baseline findings and at and after age 53.6 years for presumed new findings, although there was no effect on stage distribution. 
Cancer yield for baseline BR3 masses was 10 of 2111 (0.47% [95% CI: 0.24, 0.90]) versus 47 of 3003 (1.57% [95% CI: 1.16, 2.09]) with prior comparisons (P < .001); cancer yield for baseline calcifications was eight of 929 (0.86% [95% CI: 0.40, 1.76]) versus 84 of 2999 (2.80% [95% CI: 2.23, 3.47]) with prior comparisons (P < .001). Difference in cancer yield was 0.51% (95% CI: 0.16, 0.86) between women with and women without prior comparison at the same age (P = .006). Conclusion Cancer yield exceeded the 2% threshold for women aged 60 years or older and reached 4.6% for women aged 80-89 years. Breast Imaging Reporting and Data System 3 findings in women with a prior comparison had higher cancer yield than in those without a prior comparison at the same age. © RSNA, 2021 Online supplemental material is available for this article.
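The cancer yields quoted above are simple proportions, and an interval for such a proportion can be sketched with the Wilson score method. The abstract does not state which interval method the authors used, so the Wilson choice here is an assumption and the resulting bounds may differ slightly from the published ones. The function name is illustrative; the counts are the ones reported in the abstract.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile.
    """
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("require n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


# Counts as reported in the abstract: (label, cancers, women).
reported = [
    ("baseline masses", 10, 2111),
    ("masses with prior comparison", 47, 3003),
    ("baseline calcifications", 8, 929),
    ("calcifications with prior comparison", 84, 2999),
]
for label, cancers, women in reported:
    low, high = wilson_interval(cancers, women)
    print(f"{label}: yield {100 * cancers / women:.2f}% "
          f"(Wilson 95% CI {100 * low:.2f}, {100 * high:.2f})")
```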
Imaging and Management of Internal Mammary Lymph Nodes
Internal mammary lymph nodes (IMLNs) account for approximately 10%-40% of the lymphatic drainage of the breast. Internal mammary lymph nodes measuring up to 10 mm are commonly seen on high-risk screening breast MRI examinations in patients without breast cancer and are considered benign if no other suspicious findings are present. Benign IMLNs demonstrate a fatty hilum, lobular or oval shape, and circumscribed margins without evidence of central necrosis, cortical thickening, or loss of fatty hilum. In patients with breast cancer, IMLN involvement can alter clinical stage and treatment planning. The incidence of IMLN metastases detected on US, CT, MRI, and PET-CT ranges from 10% to 16%, with MRI and PET-CT demonstrating the highest sensitivities. Although there are no well-defined imaging criteria in the eighth edition of the American Joint Committee on Cancer Staging Manual for Breast Cancer, a long-axis measurement of ≥ 5 mm is suggested as a guideline to differentiate benign from malignant IMLNs in patients with newly diagnosed breast cancer. Abnormal morphology such as loss of fatty hilum, irregular shape, and rounded appearance (which can be quantified by a short-axis/long-axis length ratio greater than 0.5) also raises suspicion for IMLN metastases. MRI and PET-CT have good sensitivity and specificity for the detection of IMLN metastases, but fluorodeoxyglucose avidity can be seen in both benign conditions and metastatic disease. US is helpful for staging, and US-guided fine-needle aspiration can be performed in cases of suspected IMLN metastasis. Management of suspicious IMLNs identified on imaging is typically with chemotherapy and radiation, as surgical excision does not provide survival benefit and is performed only in rare cases.
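The two quantitative cues mentioned above, the long-axis size guideline and the short-axis/long-axis ratio used to quantify a rounded appearance, can be written out as a short calculation. This is only an arithmetic illustration of the quoted thresholds, not a clinical decision rule: the function name is hypothetical, and morphologic features such as loss of the fatty hilum or irregular shape are not modelled.

```python
def imln_measurement_flags(long_axis_mm: float, short_axis_mm: float) -> dict:
    """Apply the size and roundness thresholds quoted in the abstract.

    Returns the short/long axis ratio and two boolean flags. Thresholds
    (long axis >= 5 mm, ratio > 0.5) are taken from the text above.
    """
    if long_axis_mm <= 0 or short_axis_mm <= 0:
        raise ValueError("axis lengths must be positive")
    if short_axis_mm > long_axis_mm:
        raise ValueError("short axis cannot exceed long axis")

    ratio = short_axis_mm / long_axis_mm
    return {
        "ratio": ratio,
        "meets_size_guideline": long_axis_mm >= 5.0,
        "rounded": ratio > 0.5,
    }


# Invented measurements for illustration.
print(imln_measurement_flags(long_axis_mm=8.0, short_axis_mm=3.0))
print(imln_measurement_flags(long_axis_mm=4.0, short_axis_mm=3.0))
```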
Consensus survey on pre-procedural safety practices in radiological examinations: a multicenter study in seven Asian regions
OBJECTIVE: To understand the status of pre-procedural safety practices in radiological examinations at radiology residency training institutions in various Asian regions. METHODS: A questionnaire based on the Joint Commission International Accreditation Standards was electronically sent to 3 institutions each in 10 geographical regions across 9 Asian countries. Questions addressing 45 practices were divided into 3 categories. A five-tier scale with numerical scores was used to evaluate safety practices in each institution. Responses obtained from three institutions in the United States were used to validate the execution rate of each surveyed safety practice. RESULTS: The institutional response rate was 70.0% (7 Asian regions, 21 institutions). 44 practices (all those surveyed except for the application of wrist tags for identifying patients with fall risks) were validated using the US participants. Overall, the Asian participants reached a consensus on 89% of the safety practices. Comparatively, most Asian participants did not routinely perform three pre-procedural practices in the examination appropriateness topic. CONCLUSION: Based on the responses from 21 participating Asian institutions, most routinely perform standard practices during radiological examinations except when it comes to examination appropriateness. This study can provide direction for safety policymakers scrutinizing and improving regional standards of care. ADVANCES IN KNOWLEDGE: This is the first multicenter survey study to elucidate pre-procedural safety practices in radiological examinations in seven Asian regions.
Cancer Yield and Patterns of Follow-up for BI-RADS Category 3 after Screening Mammography Recall in the National Mammography Database
Background The literature supports the use of short-interval follow-up as an alternative to biopsy for lesions assessed as probably benign, Breast Imaging Reporting and Data System (BI-RADS) category 3, with an expected malignancy rate of less than 2%. Purpose To assess outcomes from 6-, 12-, and 24-month follow-up of probably benign findings first identified at recall from screening mammography in the National Mammography Database (NMD). Materials and Methods This retrospective study included women recalled from screening mammography with BI-RADS category 3 assessment at additional evaluation from January 2009 through March 2018 from 471 NMD facilities. Only the first BI-RADS category 3 occurrence for women aged 25 years or older with no personal history of breast cancer was analyzed, with biopsy or 2-year imaging follow-up. Cancer yield and positive predictive value of biopsies performed (PPV3) were determined at each follow-up. Results Among 45 202 women (median age, 55 years; range, 25-90 years) with a BI-RADS category 3 lesion, 1574 (3.5%) underwent biopsy at the time of lesion detection, yielding 72 cancers (cancer yield, 4.6%; 72 of 1574 women). For the remaining 43 628 women who accepted surveillance, 922 were seen within 90 days (with 78 lesions biopsied and 12 [15%] classified as malignant). The women still in surveillance (31 465 of 43 381 women [72.5%]) underwent follow-up mammography at 6 months. Of 3001 (9.5%) lesions biopsied, 456 (15.2%) were malignant (cancer yield, 1.5%; 456 of 31 465 women; 95% confidence interval [CI]: 1.3%, 1.6%). Among 18 748 of 25 997 women (72.1%) in surveillance who underwent follow-up at 12 months, 1219 (6.5%) underwent biopsy with 230 (18.9%) malignant lesions found (cancer yield, 1.2%; 230 of 18 748 women; 95% CI: 1.1%, 1.4%).
Through 2-year follow-up, the biopsy rate was 11.2% (4894 of 43 628 women) with a cancer yield of 1.86% (810 malignancies found among 43 628 women; 95% CI: 1.73%, 1.98%) and a PPV3 of 16.6% (810 malignancies found among 4894 women). Conclusion In the National Mammography Database, Breast Imaging Reporting and Data System (BI-RADS) category 3 use is appropriate, with 1.86% cumulative cancer yield through 2-year follow-up. Of 810 malignancies, 468 (57.8%) were diagnosed at or before 6 months, validating necessity of short-interval follow-up of mammographic BI-RADS category 3 findings. © RSNA, 2020 Online supplemental material is available for this article. See also the editorial by Moy in this issue.
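The cumulative 2-year figures above follow directly from the reported counts. The short sketch below recomputes the biopsy rate, cancer yield, PPV3, and the share of malignancies diagnosed at or before 6 months; the helper name `pct` is an illustrative choice, and the rounding precision simply mirrors what the abstract reports.

```python
def pct(numerator: int, denominator: int, digits: int = 1) -> float:
    """Percentage rounded to the given number of decimal places."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100.0 * numerator / denominator, digits)


# Counts as reported in the abstract.
women_in_surveillance = 43628
biopsies = 4894
malignancies = 810
malignancies_by_6_months = 468

biopsy_rate = pct(biopsies, women_in_surveillance)                  # biopsies per woman followed
cancer_yield = pct(malignancies, women_in_surveillance, digits=2)   # cancers per woman followed
ppv3 = pct(malignancies, biopsies)                                  # cancers per biopsy performed
early_share = pct(malignancies_by_6_months, malignancies)           # cancers found by 6 months

print(f"biopsy rate {biopsy_rate}%, cancer yield {cancer_yield}%, "
      f"PPV3 {ppv3}%, diagnosed by 6 months {early_share}%")
```

Note that cancer yield and PPV3 share a numerator but use different denominators (all women followed versus women who underwent biopsy), which is why the two values differ by roughly an order of magnitude.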
Abbreviated breast MRI: Road to clinical implementation
Breast MRI offers high sensitivity for breast cancer detection, with preferential detection of high-grade invasive cancers when compared to mammography and ultrasound. Despite the clear benefits of breast MRI in cancer screening, its cost, patient tolerance, and low utilization remain key issues. Abbreviated breast MRI, in which only a select number of sequences and postcontrast imaging are acquired, exploits the high sensitivity of breast MRI while reducing table time and reading time to maximize availability, patient tolerance, and accessibility. Worldwide studies of varying patient populations have demonstrated that the diagnostic accuracy of abbreviated breast MRI is comparable to that of a full diagnostic protocol, highlighting the emerging role of abbreviated MRI screening in patients with an intermediate and high lifetime risk of breast cancer. The purpose of this review is to summarize the background and current literature relating to abbreviated MRI, highlight various protocols utilized in current multicenter clinical trials, describe workflow and clinical implementation issues, and discuss the future of abbreviated protocols, including advanced MRI techniques.
Risk-Based Screening Mammography for Women Aged <40: Outcomes From the National Mammography Database
OBJECTIVE: There is insufficient large-scale evidence for screening mammography in women <40 years at elevated risk. This study compares risk-based screening of women aged 30 to 39 with risk factors versus women aged 40 to 49 without risk factors in the National Mammography Database (NMD). METHODS:). RESULTS:was 28.2% (27.0%-28.5%). Women aged 30 to 34 and 35 to 39 had similar CDR, RR, and PPVs, with the presence of the three evaluated risk factors associated with significantly higher CDR. Moreover, compared with a population currently recommended for screening mammography in the United States (aged 40-49 at average risk), incidence screening (at least one prior screening examination) of women aged 30 to 39 with the three evaluated risk factors has similar cancer detection rates and recall rates. DISCUSSION/CONCLUSIONS: Women with one or more of these three specific risk factors likely benefit from screening commencing at age 30 instead of age 40.
Current Status and Future Wish List of Peer Review: A National Questionnaire of U.S. Radiologists
OBJECTIVE. Most peer review programs focus on error detection, numeric scoring, and radiologist-specific error rates. The effectiveness of this method on learning and systematic improvement is uncertain at best. Radiologists have been pushing for a transition from an individually punitive peer review system to a peer-learning model. This national questionnaire of U.S. radiologists aims to assess the current status of peer review and opportunities for improvement. MATERIALS AND METHODS. A 21-question multiple-choice questionnaire was developed and face validity assessed by the ARRS Performance Quality Improvement subcommittee. The questionnaire was e-mailed to 17,695 ARRS members and open for 4 weeks; two e-mail reminders were sent. Response collection was anonymous. Only responses from board-certified, practicing radiologists participating in peer review were analyzed. RESULTS. The response rate was 4.2% (742/17,695), and 73.7% (547/742) met inclusion criteria. Most responders were in private practice (51.7%, 283/547) with a group size of 11-50 radiologists (50.5%) and in an urban setting (61.6%). Significant diversity was noted in peer review systems, with RADPEER used by less than half (45.0%) and cases selected most commonly by commercial software (36.2%) or manually (31.2%). There was no consensus on the number of required peer reviews per month (10-20 cases, 32.1%; > 20 cases, 29.1%; < 10 cases, 21.7%). Less than half (43.7%) did not use peer review for group education. Whereas most (67.7%) were notified of their peer review results individually, 21.5% were not notified at all. Around half were dissatisfied (44.5%) because of insufficient learning (94.0%) and inaccurate representation of their performance improvement (75.5%). Overall, the group discrepancy rates were unknown to most radiologists who participate in peer review (54.3%). Submission bias was the main reason for underreporting of serious discrepancies (49.0%). 
Most found four peer-learning methods feasible in daily practice: incidental observation, 65.1%; focused practice review, 52.9%; professional auditing, 45.8%; and blinded double reading, 35.4%. CONCLUSION. More than half of participants reported that peer review data are used for educational purposes. However, significant diversity remains in current peer review practice with no agreement on number of required reviews, method of case selection, and oversight of results. Nearly half of the radiologists reported insufficient learning, although most feel a better system would be feasible in daily practice.