NYUHSL Faculty Bibliography

Searched for:

in-biosketch:yes

person:ys1001

Journal of the American College of Radiology : JACR. 2025. DOI: 10.1016/j.jacr.2025.12.024

Evaluating Generative Artificial Intelligence as an Educational Tool for Radiology Resident Report Drafting

Verdone, Antonio; Cardall, Aidan; Siddiqui, Fardeen; Nashawaty, Motaz; Rigau, Danielle; Kwon, Youngjoon; Yousef, Mira; Patel, Shalin; Kieturakis, Alex; Kim, Eric; Heacock, Laura; Reig, Beatriu; Shen, Yiqiu

OBJECTIVE: Radiology residents require timely, personalized feedback to develop accurate image analysis and reporting skills. Increasing clinical workload often limits attendings' ability to provide guidance. This study evaluates a HIPAA-compliant Generative Pretrained Transformer (GPT)-4o system that delivers automated feedback on breast imaging reports drafted by residents in real clinical settings. METHODS: We analyzed 5,000 resident-attending report pairs from routine practice at a multisite US health system. GPT-4o was prompted with clinical instructions to identify common errors and provide feedback. A reader study using 100 report pairs was conducted. Four attending radiologists and four residents independently reviewed each pair, determined whether predefined error types were present, and rated GPT-4o's feedback as helpful or not. Agreement between GPT and readers was assessed using percent match. Interreader reliability was measured with Krippendorff's α. Educational value was measured as the proportion of cases rated helpful. RESULTS: Three common error types were identified: (1) omission or addition of key findings, (2) incorrect use or omission of technical descriptors, and (3) final assessment inconsistent with findings. GPT-4o showed strong agreement with attending consensus: 90.5%, 78.3%, and 90.4% (Cohen's κ: 0.790, 0.550, and 0.615) across error types. Interreader reliability among all eight readers showed moderate to substantial variability (α = 0.767, 0.595, 0.567). When each reader was individually replaced with GPT-4o and interreader agreement among seven readers and GPT was recalculated, the effect was not statistically significant (Δ = -0.004 to 0.002, all P > .05). GPT's feedback was rated helpful in most cases: 89.8%, 83.0%, and 92.0%. DISCUSSION/CONCLUSIONS: ChatGPT-4o can reliably identify key educational errors. It may serve as a scalable tool to support radiology education.

PMCID: 12869900

PMID: 41453630

ISSN: 1558-349X

CID: 6005882

Abdominal radiology. 2025. DOI: 10.1007/s00261-025-05230-1

Patient and lesion characteristics associated with follow-up completion for pancreatic cystic lesions detected on MRI

Huang, Chenchan; Thakore, Nitya L; Shen, Yiqiu; Rasromani, Ebrahim K; Saba, Bryce A; Levine, Jonah M; Jacobi, Sophia M; Chen, Runhan; Pan, Hengkai; Kang, Stella K

PURPOSE/OBJECTIVE: To evaluate the association of patient characteristics, community-level social determinants of health, and cyst risk categories with completion of follow-up recommendations for incidental Pancreatic Cystic Lesions (PCLs). METHODS: We retrospectively identified consecutive patients (2013-2023) whose MRI radiology reports described PCLs. A fine-tuned LLaMA-3.1 8B Instruct large language model was used to extract PCL features. Lesions were classified using the 2017 ACR white paper: Category 1 (low risk), Category 2 (worrisome features), or Category 3 (high-risk stigmata). We recorded demographics and follow-up imaging or endoscopic ultrasound dates. Community-level factors were characterized by the 2020 CDC Social Vulnerability Index (SVI), stratified into quartiles. The primary outcome, "inappropriate follow-up," combined late and no follow-up. Multivariable binomial regression was applied to evaluate associations with inappropriate follow-up. RESULTS: In 7,745 patients (mean age 66.3 years; 4,796 women), 92.9% (7,198/7,745) of cysts were Category 1, 6.4% (498/7,745) were Category 2, and 0.6% (49/7,745) were Category 3. Only 36.3% of patients completed appropriate follow-up, 12.1% were late, and 51.6% were lost to follow-up. Inappropriate follow-up was high in every cyst category: 64.2% in Category 1, 59.4% in Category 2 and 49.0% in Category 3. In multivariable analysis, non-English primary language (RR 1.08; 95% CI, 1.02-1.14) and residing in more vulnerable communities of the 3rd quartiles of the socioeconomic Social Vulnerability Index subcategory (RR 1.07; 95% CI, 1.02-1.12) were associated with inappropriate follow-up. Higher age-adjusted Charlson Comorbidity Index (CCI ≥ 4) (RR .84; 95% CI, .79-.88), CCI 2-3 (RR .84; 95% CI, .79-.88), and higher-risk cysts in patients under 65 years of age (RR .76; 95% CI, .65-.89) were associated with completed follow-up. CONCLUSION/CONCLUSIONS: Follow-up completion for incidental PCLs was low. Factors most consistently associated with follow-up completion were language barriers, residence in socioeconomically vulnerable communities, age-adjusted CCI and higher-risk features among those under 65 years.

PMID: 41134364

ISSN: 2366-0058

CID: 5957362

Abdominal radiology. 2025:50(6):2745-2757. DOI: 10.1007/s00261-024-04708-8

Multi-modal large language models in radiology: principles, applications, and potential

Shen, Yiqiu; Xu, Yanqi; Ma, Jiajian; Rui, Wushuang; Zhao, Chen; Heacock, Laura; Huang, Chenchan

Large language models (LLMs) and multi-modal large language models (MLLMs) represent the cutting-edge in artificial intelligence. This review provides a comprehensive overview of their capabilities and potential impact on radiology. Unlike most existing literature reviews focusing solely on LLMs, this work examines both LLMs and MLLMs, highlighting their potential to support radiology workflows such as report generation, image interpretation, EHR summarization, differential diagnosis generation, and patient education. By streamlining these tasks, LLMs and MLLMs could reduce radiologist workload, improve diagnostic accuracy, support interdisciplinary collaboration, and ultimately enhance patient care. We also discuss key limitations, such as the limited capacity of current MLLMs to interpret 3D medical images and to integrate information from both image and text data, as well as the lack of effective evaluation methods. Ongoing efforts to address these challenges are introduced.

PMID: 39621074

ISSN: 2366-0058

CID: 5780062

Abdominal radiology. 2025:50(4):1731-1743. DOI: 10.1007/s00261-024-04644-7

Advancements in early detection of pancreatic cancer: the role of artificial intelligence and novel imaging techniques

Huang, Chenchan; Shen, Yiqiu; Galgano, Samuel J; Goenka, Ajit H; Hecht, Elizabeth M; Kambadakone, Avinash; Wang, Zhen Jane; Chu, Linda C

Early detection is crucial for improving survival rates of pancreatic ductal adenocarcinoma (PDA), yet current diagnostic methods can often fail at this stage. Recently, there has been significant interest in improving risk stratification and developing imaging biomarkers, through novel imaging techniques, and most notably, artificial intelligence (AI) technology. This review provides an overview of these advancements, with a focus on deep learning methods for early detection of PDA.

PMID: 39467913

ISSN: 2366-0058

CID: 5746802

[Zhong ji yi kan] = [Medicine for intermediate groups]. 2023. DOI: 10.21203/rs.3.rs-3035772/v1

Improving Information Extraction from Pathology Reports using Named Entity Recognition

Zeng, Ken G; Dutt, Tarun; Witowski, Jan; Kranthi Kiran, G V; Yeung, Frank; Kim, Michelle; Kim, Jesi; Pleasure, Mitchell; Moczulski, Christopher; Lopez, L Julian Lechuga; Zhang, Hao; Harbi, Mariam Al; Shamout, Farah E; Major, Vincent J; Heacock, Laura; Moy, Linda; Schnabel, Freya; Pak, Linda M; Shen, Yiqiu; Geras, Krzysztof J

Pathology reports are considered the gold standard in medical research due to their comprehensive and accurate diagnostic information. Natural language processing (NLP) techniques have been developed to automate information extraction from pathology reports. However, existing studies suffer from two significant limitations. First, they typically frame their tasks as report classification, which restricts the granularity of extracted information. Second, they often fail to generalize to unseen reports due to variations in language, negation, and human error. To overcome these challenges, we propose a BERT (bidirectional encoder representations from transformers) named entity recognition (NER) system to extract key diagnostic elements from pathology reports. We also introduce four data augmentation methods to improve the robustness of our model. Trained and evaluated on 1438 annotated breast pathology reports, acquired from a large medical center in the United States, our BERT model trained with data augmentation achieves an entity F1-score of 0.916 on an internal test set, surpassing the BERT baseline (0.843). We further assessed the model's generalizability using an external validation dataset from the United Arab Emirates, where our model maintained satisfactory performance (F1-score 0.860). Our findings demonstrate that our NER systems can effectively extract fine-grained information from widely diverse medical reports, offering the potential for large-scale information extraction in a wide range of medical and AI research. We publish our code at https://github.com/nyukat/pathology_extraction.

PMCID: 10350195

PMID: 37461545

CID: 5588752

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2023. DOI: 10.1109/CVPR52729.2023.00327

Multiple Instance Learning via Iterative Self-Paced Supervised Contrastive Learning [Proceedings Paper]

Liu, Kangning; Zhu, Weicheng; Shen, Yiqiu; Liu, Sheng; Razavian, Narges; Geras, Krzysztof J; Fernandez-Granda, Carlos

ORIGINAL:0017083

ISSN: 2575-7075

CID: 5573532

Radiology. 2023. DOI: 10.1148/radiol.230163

ChatGPT and Other Large Language Models Are Double-edged Swords [Editorial]

Shen, Yiqiu; Heacock, Laura; Elias, Jonathan; Hentel, Keith D; Reig, Beatriu; Shih, George; Moy, Linda

PMID: 36700838

ISSN: 1527-1315

CID: 5419662

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2022. DOI: 10.1109/CVPR52688.2022.00263

Adaptive Early-Learning Correction for Segmentation from Noisy Annotations [Proceedings Paper]

Liu, Kangning; Zhu, Weicheng; Shen, Yiqiu; Liu, Sheng; Razavian, Narges; Geras, Krzysztof J; Fernandez-Granda, Carlos

ORIGINAL:0017084

ISSN: 2575-7075

CID: 5573542

Journal of digital imaging. 2021:34(6):1414-1423. DOI: 10.1007/s10278-021-00530-6

Reducing False-Positive Biopsies using Deep Neural Networks that Utilize both Local and Global Image Context of Screening Mammograms

Wu, Nan; Huang, Zhe; Shen, Yiqiu; Park, Jungkyu; Phang, Jason; Makino, Taro; Kim, S Gene; Cho, Kyunghyun; Heacock, Laura; Moy, Linda; Geras, Krzysztof J

Breast cancer is the most common cancer in women, and hundreds of thousands of unnecessary biopsies are done around the world at a tremendous cost. It is crucial to reduce the rate of biopsies that turn out to be benign tissue. In this study, we build deep neural networks (DNNs) to classify biopsied lesions as being either malignant or benign, with the goal of using these networks as second readers serving radiologists to further reduce the number of false-positive findings. We enhance the performance of DNNs that are trained to learn from small image patches by integrating global context provided in the form of saliency maps learned from the entire image into their reasoning, similar to how radiologists consider global context when evaluating areas of interest. Our experiments are conducted on a dataset of 229,426 screening mammography examinations from 141,473 patients. We achieve an AUC of 0.8 on a test set consisting of 464 benign and 136 malignant lesions.

PMID: 34731338

ISSN: 1618-727X

CID: 5038152

Nature communications. 2021:12(1). DOI: 10.1038/s41467-021-26023-2

Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams

Shen, Yiqiu; Shamout, Farah E; Oliver, Jamie R; Witowski, Jan; Kannan, Kawshik; Park, Jungkyu; Wu, Nan; Huddleston, Connor; Wolfson, Stacey; Millet, Alexandra; Ehrenpreis, Robin; Awal, Divya; Tyma, Cathy; Samreen, Naziya; Gao, Yiming; Chhor, Chloe; Gandhi, Stacey; Lee, Cindy; Kumari-Subaiya, Sheila; Leonard, Cindy; Mohammed, Reyhan; Moczulski, Christopher; Altabet, Jaime; Babb, James; Lewin, Alana; Reig, Beatriu; Moy, Linda; Heacock, Laura; Geras, Krzysztof J

Though consistently shown to detect mammographically occult cancers, breast ultrasound has been noted to have high false-positive rates. In this work, we present an AI system that achieves radiologist-level accuracy in identifying breast cancer in ultrasound images. Developed on 288,767 exams, consisting of 5,442,907 B-mode and Color Doppler images, the AI achieves an area under the receiver operating characteristic curve (AUROC) of 0.976 on a test set consisting of 44,755 exams. In a retrospective reader study, the AI achieves a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924 ± 0.02 radiologists). With the help of the AI, radiologists decrease their false positive rates by 37.3% and reduce requested biopsies by 27.8%, while maintaining the same level of sensitivity. This highlights the potential of AI in improving the accuracy, consistency, and efficiency of breast ultrasound diagnosis.

PMCID: 8463596

PMID: 34561440

ISSN: 2041-1723

CID: 5039442