An efficient deep neural network to classify large 3D images with small objects

Park, Jungkyu; Chledowski, Jakub; Jastrzebski, Stanislaw; Witowski, Jan; Xu, Yanqi; Du, Linda; Gaddam, Sushma; Kim, Eric; Lewin, Alana; Parikh, Ujas; Plaunova, Anastasia; Chen, Sardius; Millet, Alexandra; Park, James; Pysarenko, Kristine; Patel, Shalin; Goldberg, Julia; Wegener, Melanie; Moy, Linda; Heacock, Laura; Reig, Beatriu; Geras, Krzysztof J
3D imaging enables accurate diagnosis by providing spatial information about organ anatomy. However, using 3D images to train AI models is computationally challenging because they consist of 10x or 100x more pixels than their 2D counterparts. To be trained with high-resolution 3D images, convolutional neural networks resort to downsampling them or projecting them to 2D. We propose an effective alternative, a neural network that enables efficient classification of full-resolution 3D medical images. Compared to off-the-shelf convolutional neural networks, our network, 3D Globally-Aware Multiple Instance Classifier (3D-GMIC), uses 77.98%-90.05% less GPU memory and 91.23%-96.02% less computation. While it is trained only with image-level labels, without segmentation labels, it explains its predictions by providing pixel-level saliency maps. On a dataset collected at NYU Langone Health, including 85,526 patients with full-field 2D mammography (FFDM), synthetic 2D mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0.831 (95% CI: 0.769-0.887) in classifying breasts with malignant findings using 3D mammography. This is comparable to the performance of GMIC on FFDM (0.816, 95% CI: 0.737-0.878) and synthetic 2D (0.826, 95% CI: 0.754-0.884), which demonstrates that 3D-GMIC successfully classified large 3D images despite focusing computation on a smaller percentage of its input compared to GMIC. Therefore, 3D-GMIC identifies and utilizes extremely small regions of interest from 3D images consisting of hundreds of millions of pixels, dramatically reducing associated computational challenges. 3D-GMIC generalizes well to BCS-DBT, an external dataset from Duke University Hospital, achieving an AUC of 0.848 (95% CI: 0.798-0.896).
PMID: 37590109
ISSN: 1558-254x
CID: 5588742

Improving breast cancer diagnostics with deep learning for MRI

Witowski, Jan; Heacock, Laura; Reig, Beatriu; Kang, Stella K; Lewin, Alana; Pysarenko, Kristine; Patel, Shalin; Samreen, Naziya; Rudnicki, Wojciech; Łuczyńska, Elżbieta; Popiela, Tadeusz; Moy, Linda; Geras, Krzysztof J
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has a high sensitivity in detecting breast cancer but often leads to unnecessary biopsies and patient workup. We used a deep learning (DL) system to improve the overall accuracy of breast cancer diagnosis and personalize management of patients undergoing DCE-MRI. On the internal test set (n = 3936 exams), our system achieved an area under the receiver operating characteristic curve (AUROC) of 0.92 (95% CI: 0.92 to 0.93). In a retrospective reader study, there was no statistically significant difference (P = 0.19) between five board-certified breast radiologists and the DL system (mean ΔAUROC, +0.04 in favor of the DL system). Radiologists' performance improved when their predictions were averaged with DL's predictions [mean ΔAUPRC (area under the precision-recall curve), +0.07]. We demonstrated the generalizability of the DL system using multiple datasets from Poland and the United States. An additional reader study on a Polish dataset showed that the DL system was as robust to distribution shift as radiologists. In subgroup analysis, we observed consistent results across different cancer subtypes and patient demographics. Using decision curve analysis, we showed that the DL system can reduce unnecessary biopsies in the range of clinically relevant risk thresholds. This would lead to avoiding biopsies yielding benign results in up to 20% of all patients with BI-RADS category 4 lesions. Last, we performed an error analysis, investigating situations where DL predictions were mostly incorrect. This exploratory work creates a foundation for deployment and prospective analysis of DL-based models for breast MRI.
PMID: 36170446
ISSN: 1946-6242
CID: 5334352

Differences between human and machine perception in medical diagnosis

Makino, Taro; Jastrzębski, Stanisław; Oleszkiewicz, Witold; Chacko, Celin; Ehrenpreis, Robin; Samreen, Naziya; Chhor, Chloe; Kim, Eric; Lee, Jiyon; Pysarenko, Kristine; Reig, Beatriu; Toth, Hildegard; Awal, Divya; Du, Linda; Kim, Alice; Park, James; Sodickson, Daniel K; Heacock, Laura; Moy, Linda; Cho, Kyunghyun; Geras, Krzysztof J
Deep neural networks (DNNs) show promise in image-based medical diagnosis, but cannot be fully trusted since they can fail for reasons unrelated to underlying pathology. Humans are less likely to make such superficial mistakes, since they use features that are grounded on medical science. It is therefore important to know whether DNNs use different features than humans. Towards this end, we propose a framework for comparing human and machine perception in medical diagnosis. We frame the comparison in terms of perturbation robustness, and mitigate Simpson's paradox by performing a subgroup analysis. The framework is demonstrated with a case study in breast cancer screening, where we separately analyze microcalcifications and soft tissue lesions. While it is inconclusive whether humans and DNNs use different features to detect microcalcifications, we find that for soft tissue lesions, DNNs rely on high frequency components ignored by radiologists. Moreover, these features are located outside of the region of the images found most suspicious by radiologists. This difference between humans and machines was only visible through subgroup analysis, which highlights the importance of incorporating medical domain knowledge into the comparison.
PMID: 35477730
ISSN: 2045-2322
CID: 5205672

Lessons from the first DBTex Challenge

Park, Jungkyu; Shoshan, Yoel; Marti, Robert; Gómez del Campo, Pablo; Ratner, Vadim; Khapun, Daniel; Zlotnick, Aviad; Barkan, Ella; Gilboa-Solomon, Flora; Chłędowski, Jakub; Witowski, Jan; Millet, Alexandra; Kim, Eric; Lewin, Alana; Pysarenko, Kristine; Chen, Sardius; Goldberg, Julia; Patel, Shalin; Plaunova, Anastasia; Wegener, Melanie; Wolfson, Stacey; Lee, Jiyon; Hava, Sana; Murthy, Sindhoora; Du, Linda; Gaddam, Sushma; Parikh, Ujas; Heacock, Laura; Moy, Linda; Reig, Beatriu; Rosen-Zvi, Michal; Geras, Krzysztof J.
ISSN: 2522-5839
CID: 5000532

Comparison between qualitative and quantitative assessment of background parenchymal enhancement on breast MRI

Pujara, Akshat C; Mikheev, Artem; Rusinek, Henry; Gao, Yiming; Chhor, Chloe; Pysarenko, Kristine; Rallapalli, Harikrishna; Walczyk, Jerzy; Moccaldi, Melanie; Babb, James S; Melsaether, Amy N
BACKGROUND: Potential clinical implications of the level of background parenchymal enhancement (BPE) on breast MRI are increasing. Currently, BPE is typically evaluated subjectively. Tests of concordance between subjective BPE assessment and computer-assisted quantified BPE have not been reported. PURPOSE OR HYPOTHESIS: To compare subjective radiologist assessment of BPE with objective quantified parenchymal enhancement (QPE). STUDY TYPE: Cross-sectional observational study. POPULATION: Between 7/24/2015 and 11/27/2015, 104 sequential patients (ages 23 - 81 years, mean 49 years) without breast cancer underwent breast MRI and were included in this study. FIELD STRENGTH/SEQUENCE: 3T; fat suppressed axial T2, axial T1, and axial fat suppressed T1 before and after intravenous contrast. ASSESSMENT: Four breast imagers graded BPE at 90 and 180 s after contrast injection on a 4-point scale (a-d). Fibroglandular tissue masks were generated using a phantom-validated segmentation algorithm, and were co-registered to pre- and postcontrast fat suppressed images to define the region of interest. QPE was calculated. STATISTICAL TESTS: Receiver operating characteristic (ROC) analyses and kappa coefficients (k) were used to compare subjective BPE with QPE. RESULTS: ROC analyses indicated that subjective BPE at 90 s was best predicted by quantified QPE 50.0 = d, and at 180 s by quantified QPE 74.5 = d. Agreement between subjective BPE and QPE was slight to fair at 90 s (k = 0.20-0.36) and 180 s (k = 0.19-0.28). At higher levels of QPE, agreement between subjective BPE and QPE significantly decreased for all four radiologists at 90 s (P
PMID: 29140576
ISSN: 1522-2586
CID: 2785262

Structured Reporting: A Tool to Improve Reimbursement

Pysarenko, Kristine; Recht, Michael; Kim, Danny
PMID: 28027857
ISSN: 1558-349x
CID: 2383582

Clinical applicability and relevance of fibroglandular tissue segmentation on routine T1 weighted breast MRI

Pujara, Akshat C; Mikheev, Artem; Rusinek, Henry; Rallapalli, Harikrishna; Walczyk, Jerzy; Gao, Yiming; Chhor, Chloe; Pysarenko, Kristine; Babb, James S; Melsaether, Amy N
PURPOSE: To evaluate clinical applicability of fibroglandular tissue (FGT) segmentation on routine T1 weighted breast MRI and compare FGT quantification with radiologist assessment. METHODS: FGT was segmented on 232 breasts and quantified, and was assessed qualitatively by four breast imagers. RESULTS: FGT segmentation was successful in all 232 breasts. Agreement between radiologists and quantified FGT was moderate to substantial (kappa=0.52-0.67); lower quantified FGT was associated with disagreement between radiologists and quantified FGT (P
PMID: 27951458
ISSN: 1873-4499
CID: 2363342

Background parenchymal enhancement over exam time in patients with and without breast cancer

Melsaether, Amy; Pujara, Akshat C; Elias, Kristin; Pysarenko, Kristine; Gudi, Anjali; Dodelzon, Katerina; Babb, James S; Gao, Yiming; Moy, Linda
PURPOSE: To compare background parenchymal enhancement (BPE) over time in patients with and without breast cancer. MATERIALS AND METHODS: This retrospective Institutional Review Board (IRB)-approved, Health Insurance Portability and Accountability Act (HIPAA)-compliant study included 116 women (25-84 years, mean 54 years) with breast cancer who underwent breast magnetic resonance imaging at 3T between 1/2/2009 and 12/29/2009 and 116 age and date-of-exam-matched women without breast cancer (23-84 years, mean 51 years). Two independent, blinded readers (R1, R2) recorded BPE (minimal, mild, moderate, marked) at three times (100, 210, and 320 seconds postcontrast). Subsequent cancers were diagnosed in 9/96 control patients with follow up (12.6-93.0 months, mean 63.6 months). Exact Mann-Whitney, Fisher's exact, and McNemar tests were performed. RESULTS: Mean BPE was not found to be different between patients with and without breast cancer at any time (P = 0.36-0.64). At time 2 as compared with time 1, there were significantly more patients, both with and without breast cancer, with BPE >minimal (R1: 90 vs. 41 [P < 0.001] and 81 vs. 36 [P < 0.001]; R2: 84 vs. 52 [P < 0.001] and 79 vs. 43 [P < 0.001]) and BPE >mild (R1: 59 vs. 10 [P < 0.001] and 47 vs. 13 [P < 0.001]; R2: 49 vs. 12 [P < 0.001] and 41 vs. 18 [P < 0.001]). BPE changes between times 2 and 3 were not significant (P = 0.083-1.0). Odds ratios for control patients developing breast cancer were significant only for R2 and ranged up to 7.67 (1.49, 39.5; P < 0.01) for BPE >mild at time 2. CONCLUSION: BPE changes between the first and second postcontrast scans and stabilizes thereafter in most patients. Further investigation into the most clinically relevant timepoint for BPE assessment is warranted. J. Magn. Reson. Imaging 2016.
PMID: 27285396
ISSN: 1522-2586
CID: 2136622

The Patient Experience in Radiology: Observations From Over 3,500 Patient Feedback Reports in a Single Institution

Rosenkrantz, Andrew B; Pysarenko, Kristine
PURPOSE: To identify factors associated with the patient experience in radiology based on patient feedback reports from a single institution. METHODS: In a departmental patient experience committee initiative, all imaging outpatients are provided names and roles of all departmental employees with whom they interact, along with contact information for providing feedback after their appointment. All resulting feedback was recorded in a web-based database. A total of 3,675 patient comments over a 3-year period were assessed in terms of major themes. Roles of employees recognized within the patient comments were also assessed. RESULTS: Patient feedback comments most commonly related to professional staff behavior (74.5%) and wait times (11.9%), and less commonly related to a spectrum of other issues (comfort during the exam, quality of the facilities, access to information regarding the exam, patient privacy, medical records, the radiology report, billing). The most common attributes relating to staff behavior involved patients' perceptions of staff caring, professionalism, pleasantness, helpfulness, and efficiency. Employees most commonly recognized by the comments were the technologist (50.2%) and receptionist (31.6%) and much less often the radiologist (2.2%). No radiologist was in the top 10% of employees in terms of the number of comments received. CONCLUSION: Patients' comments regarding their experiences in undergoing radiologic imaging were largely influenced by staff behavior and communication (particularly relating to technologists and receptionists), as well as wait times, with radiologists having a far lesser immediate impact. Radiologists are encouraged to engage in activities that promote direct visibility to their patients and thereby combat risks of the perceived "invisible" radiologist.
PMID: 27318577
ISSN: 1558-349x
CID: 2158982

What Do Patients Tweet About Their Mammography Experience?

Rosenkrantz, Andrew B; Labib, Anthony; Pysarenko, Kristine; Prabhu, Vinay
RATIONALE AND OBJECTIVE: The purpose of this study was to evaluate themes related to patients' experience in undergoing mammography, as expressed on Twitter. METHODS: A total of 464 tweets from July to December 2015 containing the hashtag #mammogram and relating to a patient's experience in undergoing mammography were reviewed. RESULTS: Of the tweets, 45.5% occurred before the mammogram compared to 49.6% that occurred afterward (remainder of tweets indeterminate). However, in patients undergoing their first mammogram, 32.8% occurred before the examination, whereas in those undergoing follow-up mammogram, 53.0% occurred before the examination. Identified themes included breast compression (24.4%), advising other patients to undergo screening (23.9%), recognition of the health importance of the examination (18.8%), the act of waiting (10.1%), relief regarding results (9.7%), reflection that the examination was not that bad (9.1%), generalized apprehension regarding the examination (8.2%), interactions with staff (8.0%), the gown (5.0%), examination costs or access (3.4%), offering or reaching out for online support from other patients (3.2%), perception of screening as a sign of aging (2.4%), and the waiting room or waiting room amenities (1.3%). Of the tweets, 31.9% contained humor, of which 56.1% related to compression. Themes that were more common in patients undergoing their first, rather than follow-up, mammogram included breast compression (16.4% vs 9.1%, respectively) and that the test was not that bad (26.2% vs 7.6%, respectively). CONCLUSION: Online social media provides a platform for women to share their experiences and reactions in undergoing mammography, including humor, positive reflections, and encouragement of others to undergo the examination. Social media thus warrants further evaluation as a potential tool to help foster greater adherence to screening guidelines.
PMID: 27658329
ISSN: 1878-4046
CID: 2254922