NYUHSL Faculty Bibliography

Searched for:

person:ys1001

in-biosketch:yes

Total Results:

Research square. 2023.DOI: 10.21203/rs.3.rs-3035772/v1

Improving Information Extraction from Pathology Reports using Named Entity Recognition

Zeng, Ken G; Dutt, Tarun; Witowski, Jan; Kranthi Kiran, G V; Yeung, Frank; Kim, Michelle; Kim, Jesi; Pleasure, Mitchell; Moczulski, Christopher; Lopez, L Julian Lechuga; Zhang, Hao; Harbi, Mariam Al; Shamout, Farah E; Major, Vincent J; Heacock, Laura; Moy, Linda; Schnabel, Freya; Pak, Linda M; Shen, Yiqiu; Geras, Krzysztof J

Pathology reports are considered the gold standard in medical research due to their comprehensive and accurate diagnostic information. Natural language processing (NLP) techniques have been developed to automate information extraction from pathology reports. However, existing studies suffer from two significant limitations. First, they typically frame their tasks as report classification, which restricts the granularity of extracted information. Second, they often fail to generalize to unseen reports due to variations in language, negation, and human error. To overcome these challenges, we propose a BERT (bidirectional encoder representations from transformers) named entity recognition (NER) system to extract key diagnostic elements from pathology reports. We also introduce four data augmentation methods to improve the robustness of our model. Trained and evaluated on 1438 annotated breast pathology reports, acquired from a large medical center in the United States, our BERT model trained with data augmentation achieves an entity F1-score of 0.916 on an internal test set, surpassing the BERT baseline (0.843). We further assessed the model's generalizability using an external validation dataset from the United Arab Emirates, where our model maintained satisfactory performance (F1-score 0.860). Our findings demonstrate that our NER systems can effectively extract fine-grained information from widely diverse medical reports, offering the potential for large-scale information extraction in a wide range of medical and AI research. We publish our code at https://github.com/nyukat/pathology_extraction.

PMCID:10350195

PMID: 37461545

CID: 5588752

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2023.DOI: 10.1109/CVPR52729.2023.00327

Multiple Instance Learning via Iterative Self-Paced Supervised Contrastive Learning [Proceedings Paper]

Liu, Kangning; Zhu, Weicheng; Shen, Yiqiu; Liu, Sheng; Razavian, Narges; J. Geras, Krzysztof; Fernandez-Granda, Carlos

ORIGINAL:0017083

ISSN: 2575-7075

CID: 5573532

Radiology. 2023.DOI: 10.1148/radiol.230163

ChatGPT and Other Large Language Models Are Double-edged Swords [Editorial]

Shen, Yiqiu; Heacock, Laura; Elias, Jonathan; Hentel, Keith D; Reig, Beatriu; Shih, George; Moy, Linda

PMID: 36700838

ISSN: 1527-1315

CID: 5419662

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2022.DOI: 10.1109/CVPR52688.2022.00263

Adaptive Early-Learning Correction for Segmentation from Noisy Annotations [Proceedings Paper]

Liu, Kangning; Zhu, Weicheng; Shen, Yiqiu; Liu, Sheng; Razavian, Narges; J. Geras, Krzysztof; Fernandez-Granda, Carlos

ORIGINAL:0017084

ISSN: 2575-7075

CID: 5573542

Journal of digital imaging. 2021:34(6):1414-1423.DOI: 10.1007/s10278-021-00530-6

Reducing False-Positive Biopsies using Deep Neural Networks that Utilize both Local and Global Image Context of Screening Mammograms

Wu, Nan; Huang, Zhe; Shen, Yiqiu; Park, Jungkyu; Phang, Jason; Makino, Taro; Gene Kim, S; Cho, Kyunghyun; Heacock, Laura; Moy, Linda; Geras, Krzysztof J

Breast cancer is the most common cancer in women, and hundreds of thousands of unnecessary biopsies are done around the world at a tremendous cost. It is crucial to reduce the rate of biopsies that turn out to be benign tissue. In this study, we build deep neural networks (DNNs) to classify biopsied lesions as being either malignant or benign, with the goal of using these networks as second readers serving radiologists to further reduce the number of false-positive findings. We enhance the performance of DNNs that are trained to learn from small image patches by integrating global context provided in the form of saliency maps learned from the entire image into their reasoning, similar to how radiologists consider global context when evaluating areas of interest. Our experiments are conducted on a dataset of 229,426 screening mammography examinations from 141,473 patients. We achieve an AUC of 0.8 on a test set consisting of 464 benign and 136 malignant lesions.

PMID: 34731338

ISSN: 1618-727x

CID: 5038152

Nature communications. 2021:12(1).DOI: 10.1038/s41467-021-26023-2

Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams

Shen, Yiqiu; Shamout, Farah E; Oliver, Jamie R; Witowski, Jan; Kannan, Kawshik; Park, Jungkyu; Wu, Nan; Huddleston, Connor; Wolfson, Stacey; Millet, Alexandra; Ehrenpreis, Robin; Awal, Divya; Tyma, Cathy; Samreen, Naziya; Gao, Yiming; Chhor, Chloe; Gandhi, Stacey; Lee, Cindy; Kumari-Subaiya, Sheila; Leonard, Cindy; Mohammed, Reyhan; Moczulski, Christopher; Altabet, Jaime; Babb, James; Lewin, Alana; Reig, Beatriu; Moy, Linda; Heacock, Laura; Geras, Krzysztof J

Though consistently shown to detect mammographically occult cancers, breast ultrasound has been noted to have high false-positive rates. In this work, we present an AI system that achieves radiologist-level accuracy in identifying breast cancer in ultrasound images. Developed on 288,767 exams, consisting of 5,442,907 B-mode and Color Doppler images, the AI achieves an area under the receiver operating characteristic curve (AUROC) of 0.976 on a test set consisting of 44,755 exams. In a retrospective reader study, the AI achieves a higher AUROC than the average of ten board-certified breast radiologists (AUROC: 0.962 AI, 0.924â€‰Â±â€‰0.02 radiologists). With the help of the AI, radiologists decrease their false positive rates by 37.3% and reduce requested biopsies by 27.8%, while maintaining the same level of sensitivity. This highlights the potential of AI in improving the accuracy, consistency, and efficiency of breast ultrasound diagnosis.

PMCID:8463596

PMID: 34561440

ISSN: 2041-1723

CID: 5039442

Proceedings of machine learning research. 2021:143:268-285.DOI:

Weakly-supervised High-resolution Segmentation of Mammography Images for Breast Cancer Diagnosis

Liu, Kangning; Shen, Yiqiu; Wu, Nan; ChÅ‚Ä™dowski, Jakub; Fernandez-Granda, Carlos; Geras, Krzysztof J

In the last few years, deep learning classifiers have shown promising results in image-based medical diagnosis. However, interpreting the outputs of these models remains a challenge. In cancer diagnosis, interpretability can be achieved by localizing the region of the input image responsible for the output, i.e. the location of a lesion. Alternatively, segmentation or detection models can be trained with pixel-wise annotations indicating the locations of malignant lesions. Unfortunately, acquiring such labels is labor-intensive and requires medical expertise. To overcome this difficulty, weakly-supervised localization can be utilized. These methods allow neural network classifiers to output saliency maps highlighting the regions of the input most relevant to the classification task (e.g. malignant lesions in mammograms) using only image-level labels (e.g. whether the patient has cancer or not) during training. When applied to high-resolution images, existing methods produce low-resolution saliency maps. This is problematic in applications in which suspicious lesions are small in relation to the image size. In this work, we introduce a novel neural network architecture to perform weakly-supervised segmentation of high-resolution images. The proposed model selects regions of interest via coarse-level localization, and then performs fine-grained segmentation of those regions. We apply this model to breast cancer diagnosis with screening mammography, and validate it on a large clinically-realistic dataset. Measured by Dice similarity score, our approach outperforms existing methods by a large margin in terms of localization performance of benign and malignant lesions, relatively improving the performance by 39.6% and 20.0%, respectively. Code and the weights of some of the models are available at https://github.com/nyukat/GLAM.

PMCID:8791642

PMID: 35088055

ISSN: 2640-3498

CID: 5154792

NPJ digital medicine. 2021:4(1).DOI: 10.1038/s41746-021-00453-0

An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department

Shamout, Farah E; Shen, Yiqiu; Wu, Nan; Kaku, Aakash; Park, Jungkyu; Makino, Taro; JastrzÄ™bski, StanisÅ‚aw; Witowski, Jan; Wang, Duo; Zhang, Ben; Dogra, Siddhant; Cao, Meng; Razavian, Narges; Kudlowitz, David; Azour, Lea; Moore, William; Lui, Yvonne W; Aphinyanaphongs, Yindalon; Fernandez-Granda, Carlos; Geras, Krzysztof J

During the coronavirus disease 2019 (COVID-19) pandemic, rapid and accurate triage of patients at the emergency department is critical to inform decision-making. We propose a data-driven approach for automatic prediction of deterioration risk using a deep neural network that learns from chest X-ray images and a gradient boosting model that learns from routine clinical variables. Our AI prognosis system, trained using data from 3661 patients, achieves an area under the receiver operating characteristic curve (AUC) of 0.786 (95% CI: 0.745-0.830) when predicting deterioration within 96â€‰hours. The deep neural network extracts informative areas of chest X-ray images to assist clinicians in interpreting the predictions and performs comparably to two radiologists in a reader study. In order to verify performance in a real clinical setting, we silently deployed a preliminary version of the deep neural network at New York University Langone Health during the first wave of the pandemic, which produced accurate predictions in real-time. In summary, our findings demonstrate the potential of the proposed system for assisting front-line physicians in the triage of COVID-19 patients.

PMID: 33980980

ISSN: 2398-6352

CID: 4867572

Medical image analysis. 2020:68.DOI: 10.1016/j.media.2020.101908

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization

Shen, Yiqiu; Wu, Nan; Phang, Jason; Park, Jungkyu; Liu, Kangning; Tyagi, Sudarshini; Heacock, Laura; Kim, S Gene; Moy, Linda; Cho, Kyunghyun; Geras, Krzysztof J

Medical images differ from natural images in significantly higher resolutions and smaller regions of interest. Because of these differences, neural network architectures that work well for natural images might not be applicable to medical image analysis. In this work, we propose a novel neural network model to address these unique properties of medical images. This model first uses a low-capacity, yet memory-efficient, network on the whole image to identify the most informative regions. It then applies another higher-capacity network to collect details from chosen regions. Finally, it employs a fusion module that aggregates global and local information to make a prediction. While existing methods often require lesion segmentation during training, our model is trained with only image-level labels and can generate pixel-level saliency maps indicating possible malignant findings. We apply the model to screening mammography interpretation: predicting the presence or absence of benign and malignant lesions. On the NYU Breast Cancer Screening Dataset, our model outperforms (AUCÂ =Â 0.93) ResNet-34 and Faster R-CNN in classifying breasts with malignant findings. On the CBIS-DDSM dataset, our model achieves performance (AUCÂ =Â 0.858) on par with state-of-the-art approaches. Compared to ResNet-34, our model is 4.1x faster for inference while using 78.4% less GPU memory. Furthermore, we demonstrate, in a reader study, that our model surpasses radiologist-level AUC by a margin of 0.11.

PMID: 33383334

ISSN: 1361-8423

CID: 4759232

Radiology. 2020.DOI: 10.1148/radiol.2020192091

Prediction of Total Knee Replacement and Diagnosis of Osteoarthritis by Using Deep Learning on Knee Radiographs: Data from the Osteoarthritis Initiative

Leung, Kevin; Zhang, Bofei; Tan, Jimin; Shen, Yiqiu; Geras, Krzysztof J; Babb, James S; Cho, Kyunghyun; Chang, Gregory; Deniz, Cem M

Background The methods for assessing knee osteoarthritis (OA) do not provide enough comprehensive information to make robust and accurate outcome predictions. Purpose To develop a deep learning (DL) prediction model for risk of OA progression by using knee radiographs in patients who underwent total knee replacement (TKR) and matched control patients who did not undergo TKR. Materials and Methods In this retrospective analysis that used data from the OA Initiative, a DL model on knee radiographs was developed to predict both the likelihood of a patient undergoing TKR within 9 years and Kellgren-Lawrence (KL) grade. Study participants included a case-control matched subcohort between 45 and 79 years. Patients were matched to control patients according to age, sex, ethnicity, and body mass index. The proposed model used a transfer learning approach based on the ResNet34 architecture with sevenfold nested cross-validation. Receiver operating characteristic curve analysis and conditional logistic regression assessed model performance for predicting probability and risk of TKR compared with clinical observations and two binary outcome prediction models on the basis of radiographic readings: KL grade and OA Research Society International (OARSI) grade. Results Evaluated were 728 participants including 324 patients (mean age, 64 years Â± 8 [standard deviation]; 222 women) and 324 control patients (mean age, 64 years Â± 8; 222 women). The prediction model based on DL achieved an area under the receiver operating characteristic curve (AUC) of 0.87 (95% confidence interval [CI]: 0.85, 0.90), outperforming a baseline prediction model by using KL grade with an AUC of 0.74 (95% CI: 0.71, 0.77; P < .001). The risk for TKR increased with probability that a person will undergo TKR from the DL model (odds ratio [OR], 7.7; 95% CI: 2.3, 25; P < .001), KL grade (OR, 1.92; 95% CI: 1.17, 3.13; P = .009), and OARSI grade (OR, 1.20; 95% CI: 0.41, 3.50; P = .73). Conclusion The proposed deep learning model better predicted risk of total knee replacement in osteoarthritis than did binary outcome models by using standard grading systems. Â©â€‰RSNA, 2020 Online supplemental material is available for this article. See also the editorial by Richardson in this issue.

PMID: 32573386

ISSN: 1527-1315

CID: 4492992