A validated, real-time prediction model for favorable outcomes in hospitalized COVID-19 patients
The COVID-19 pandemic has challenged front-line clinical decision-making, leading to numerous published prognostic tools. However, few models have been prospectively validated and none report implementation in practice. Here, we use 3345 retrospective and 474 prospective hospitalizations to develop and validate a parsimonious model to identify patients with favorable outcomes within 96 h of a prediction, based on real-time lab values, vital signs, and oxygen support variables. In retrospective and prospective validation, the model achieves high average precision (88.6%, 95% CI: [88.4-88.7], and 90.8% [90.8-90.8]) and discrimination (95.1% [95.1-95.2] and 86.8% [86.8-86.9]), respectively. We implemented and integrated the model into the EHR, achieving a positive predictive value of 93.3% with 41% sensitivity. Preliminary results suggest clinicians are adopting these scores into their clinical workflows.
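The reported operating point (PPV 93.3% at 41% sensitivity) reflects a threshold choice on the model's output score. A toy sketch of that trade-off, using synthetic labels and scores in place of the study's actual model and data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic labels (1 = favorable outcome within 96 h) and model scores;
# scores for favorable outcomes are shifted higher on average.
y = rng.integers(0, 2, 1000)
scores = y + rng.normal(0.0, 0.7, 1000)

# A conservative threshold trades sensitivity for a high PPV.
threshold = 1.5
flagged = scores >= threshold
tp = int(np.sum(flagged & (y == 1)))
fp = int(np.sum(flagged & (y == 0)))
fn = int(np.sum(~flagged & (y == 1)))

ppv = tp / (tp + fp)          # precision among flagged patients
sensitivity = tp / (tp + fn)  # share of favorable outcomes that are flagged
print(f"PPV={ppv:.3f}, sensitivity={sensitivity:.3f}")
```

Lowering the threshold raises sensitivity at the cost of PPV; the published figures correspond to one such deliberately conservative choice.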
An artificial intelligence system for predicting the deterioration of COVID-19 patients in the emergency department [PrePrint]
During the COVID-19 pandemic, rapid and accurate triage of patients at the emergency department is critical to inform decision-making. We propose a data-driven approach for automatic prediction of deterioration risk using a deep neural network that learns from chest X-ray images, and a gradient boosting model that learns from routine clinical variables. Our AI prognosis system, trained using data from 3,661 patients, achieves an AUC of 0.786 (95% CI: 0.742-0.827) when predicting deterioration within 96 hours. The deep neural network extracts informative areas of chest X-ray images to assist clinicians in interpreting the predictions, and performs comparably to two radiologists in a reader study. In order to verify performance in a real clinical setting, we silently deployed a preliminary version of the deep neural network at NYU Langone Health during the first wave of the pandemic, which produced accurate predictions in real-time. In summary, our findings demonstrate the potential of the proposed system for assisting front-line physicians in the triage of COVID-19 patients.
Augmented reality microscopes for cancer histopathology
Predicting childhood obesity using electronic health records and publicly available data
BACKGROUND: Because of the strong link between childhood obesity and adulthood obesity comorbidities, and the difficulty in decreasing body mass index (BMI) later in life, effective strategies are needed to address this condition in early childhood. The ability to predict obesity before age five could be a useful tool, allowing prevention strategies to focus on high-risk children. The few existing prediction models for obesity in childhood have primarily employed data from longitudinal cohort studies, relying on difficult-to-collect data that are not readily available to all practitioners. Instead, we utilized real-world unaugmented electronic health record (EHR) data from the first two years of life to predict obesity status at age five, an approach not yet taken in pediatric obesity research. METHODS AND FINDINGS/RESULTS: We trained a variety of machine learning algorithms to perform both binary classification and regression. Following previous studies demonstrating different obesity determinants for boys and girls, we similarly developed separate models for each group. In both the boys' and girls' models, we found that weight-for-length z-score, BMI between 19 and 24 months, and the last BMI measure recorded before age two were the most important features for prediction. The best-performing models were able to predict obesity with an area under the receiver operating characteristic curve (AUC) of 81.7% for girls and 76.1% for boys. CONCLUSIONS: We were able to predict obesity at age five using EHR data with an AUC comparable to cohort-based studies, reducing the need for investment in additional data collection. Our results suggest that machine learning approaches for predicting future childhood obesity from EHR data could improve the ability of clinicians and researchers to drive future policy, intervention design, and the decision-making process in a clinical setting.
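The sex-stratified setup described above can be sketched as follows. The cohort, features, and model choice here are illustrative stand-ins (the abstract does not specify which algorithm performed best), with synthetic data in place of EHR records:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_cohort(n=600):
    # Synthetic stand-ins for the top reported predictors:
    # weight-for-length z-score and the last BMI recorded before age two.
    wfl_z = rng.normal(0.0, 1.0, n)
    last_bmi = rng.normal(17.0, 1.5, n)
    logits = 0.9 * wfl_z + 0.6 * (last_bmi - 17.0) - 1.0
    obese_at_5 = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)
    return np.column_stack([wfl_z, last_bmi]), obese_at_5

# Separate models per sex, mirroring the study design.
aucs = {}
for sex in ("girls", "boys"):
    X, y = make_cohort()
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0)
    clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    aucs[sex] = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

print(aucs)
```

Fitting one model per group, rather than adding sex as a feature, lets each model learn its own feature weights, which matches the study's finding of different obesity determinants for boys and girls.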
Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning
Visual inspection of histopathology slides is one of the main methods used by pathologists to assess the stage, type and subtype of lung tumors. Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most prevalent subtypes of lung cancer, and their distinction requires visual inspection by an experienced pathologist. In this study, we trained a deep convolutional neural network (inception v3) on whole-slide images obtained from The Cancer Genome Atlas to accurately and automatically classify them into LUAD, LUSC or normal lung tissue. The performance of our method is comparable to that of pathologists, with an average area under the curve (AUC) of 0.97. Our model was validated on independent datasets of frozen tissues, formalin-fixed paraffin-embedded tissues and biopsies. Furthermore, we trained the network to predict the ten most commonly mutated genes in LUAD. We found that six of them (STK11, EGFR, FAT1, SETBP1, KRAS and TP53) can be predicted from pathology images, with AUCs from 0.733 to 0.856 as measured on a held-out population. These findings suggest that deep-learning models can assist pathologists in the detection of cancer subtype or gene mutations. Our approach can be applied to any cancer type, and the code is available at https://github.com/ncoudray/DeepPATH.
Population-Level Prediction of Type 2 Diabetes From Claims Data and Analysis of Risk Factors
We present a new approach to population health, in which data-driven predictive models are learned for outcomes such as type 2 diabetes. Our approach enables risk assessment from readily available electronic claims data on large populations, without additional screening cost, and the proposed model uncovers both early- and late-stage risk factors. Using administrative claims, pharmacy records, healthcare utilization, and laboratory results of 4.1 million individuals between 2005 and 2009, an initial set of 42,000 variables was derived that together describe the full health status and history of every individual. Machine learning was then used to methodically enhance the predictive variable set and fit models predicting onset of type 2 diabetes in 2009-2011, 2010-2012, and 2011-2013. We compared the enhanced model with a parsimonious model consisting of known diabetes risk factors in a real-world environment, where missing values are common. Furthermore, we analyzed novel and known risk factors emerging from the model at different age groups and at different stages before onset. The parsimonious model, using 21 classic diabetes risk factors, resulted in an area under the ROC curve (AUC) of 0.75 for diabetes prediction within a 2-year window following the baseline. The enhanced model increased the AUC to 0.80, with about 900 variables selected as predictive (p < 0.0001 for the difference between AUCs). Similar improvements were observed for models predicting diabetes onset 1-3 years and 2-4 years after baseline. The enhanced model improved positive predictive value by at least 50% and identified novel surrogate risk factors for type 2 diabetes, such as chronic liver disease (odds ratio [OR] 3.71), high alanine aminotransferase (OR 2.26), esophageal reflux (OR 1.85), and history of acute bronchitis (OR 1.45). Liver risk factors emerge later in the process of diabetes development than obesity-related factors such as hypertension and high hemoglobin A1c.
In conclusion, population-level risk prediction for type 2 diabetes using readily available administrative data is feasible and has better prediction performance than classical diabetes risk prediction algorithms on very large populations with missing data. The new model enables intervention allocation at national scale quickly and accurately and recovers potentially novel risk factors at different stages before the disease onset.
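The reported AUC gain (0.75 to 0.80, p < 0.0001) can be probed with a paired bootstrap on a shared test set. A minimal sketch, with synthetic risk scores standing in for the parsimonious and enhanced models (the study's actual significance test is not specified here):

```python
import numpy as np

rng = np.random.default_rng(1)

def auc(y, scores):
    # AUC via the Mann-Whitney U statistic: the probability that a random
    # positive outscores a random negative (ties count half).
    pos, neg = scores[y == 1], scores[y == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

n = 1000
y = rng.integers(0, 2, n)
# Synthetic risk scores: the "enhanced" model is less noisy.
parsimonious = y + rng.normal(0, 1.2, n)
enhanced = y + rng.normal(0, 0.8, n)

# Paired bootstrap: resample the same indices for both models so the
# difference reflects the models, not sampling noise in the cohort.
diffs = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    yb = y[idx]
    if yb.min() == yb.max():  # need both classes in the resample
        continue
    diffs.append(auc(yb, enhanced[idx]) - auc(yb, parsimonious[idx]))

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"bootstrap 95% CI for AUC difference: [{lo:.3f}, {hi:.3f}]")
```

A confidence interval for the difference that excludes zero supports the claim that the enhanced model's AUC improvement is not a sampling artifact.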
Deep learning integrates histopathology and proteogenomics at a pan-cancer level
We introduce a pioneering approach that integrates pathology imaging with transcriptomics and proteomics to identify predictive histology features associated with critical clinical outcomes in cancer. We utilize 2,755 H&E-stained histopathological slides from 657 patients across 6 cancer types from CPTAC. Our models effectively recapitulate distinctions readily made by human pathologists: tumor vs. normal (AUROC = 0.995) and tissue-of-origin (AUROC = 0.979). We further investigate predictive power on tasks not normally performed from H&E alone, including TP53 prediction and pathologic stage. Importantly, we describe predictive morphologies not previously utilized in a clinical setting. The incorporation of transcriptomics and proteomics identifies pathway-level signatures and cellular processes driving predictive histology features. Model generalizability and interpretability are confirmed using TCGA. We propose a classification system for these tasks, and suggest potential clinical applications for this integrated human and machine learning approach. A publicly available web-based platform implements these models.
Multiple Instance Learning via Iterative Self-Paced Supervised Contrastive Learning [Proceedings Paper]
On gaps of clinical diagnosis of dementia subtypes: A study of Alzheimer's disease and Lewy body disease
Introduction: Alzheimer's disease (AD) and Lewy body disease (LBD) are the two most common neurodegenerative dementias and can occur in combination (AD+LBD). Due to overlapping biomarkers and symptoms, clinical differentiation of these subtypes can be difficult. However, it is unclear how the magnitude of diagnostic uncertainty varies across dementia spectra and demographic variables. We aimed to compare clinical diagnoses with post-mortem autopsy-confirmed pathological results to assess the quality of clinical subtype diagnosis across these factors. Methods: We studied data from 1,920 participants recorded by the National Alzheimer's Coordinating Center from 2005 to 2019. Selection criteria included autopsy-based neuropathological assessments for AD and LBD, and an initial visit with a Clinical Dementia Rating (CDR) stage of normal, mild cognitive impairment, or mild dementia. Longitudinally, we analyzed the first visit at each subsequent CDR stage. This analysis included positive predictive values, specificity, sensitivity and false negative rates of clinical diagnosis, as well as disparities by sex, race, age, and education. If autopsy-confirmed AD and/or LBD was missed in the clinic, the alternative clinical diagnosis was analyzed. Findings: Clinical diagnosis of AD+LBD had poor sensitivity. Over 61% of participants with autopsy-confirmed AD+LBD were diagnosed clinically as AD. Clinical diagnosis of AD had low sensitivity at the early dementia stage and low specificity at all stages. Among participants diagnosed as AD in the clinic, over 32% had concurrent LBD neuropathology at autopsy. Among participants diagnosed as LBD, 32% to 54% revealed concurrent autopsy-confirmed AD pathology. When any of the three subtypes was missed by clinicians, "no cognitive impairment" and "primary progressive aphasia or behavioral variant frontotemporal dementia" were the leading primary etiologic clinical diagnoses.
With increasing dementia stage, clinical diagnosis accuracy for Black participants became significantly worse than for other races, and diagnosis quality significantly improved for males but not for females. Discussion: These findings demonstrate that clinical diagnoses of AD, LBD, and AD+LBD are inaccurate and suffer from significant disparities by race and sex. They have important implications for clinical management, anticipatory guidance, trial enrollment, and the applicability of potential therapies for AD, and they motivate research into better biomarker-based assessment of LBD pathology.
Generalizable deep learning model for early Alzheimer's disease detection from structural MRIs
Early diagnosis of Alzheimer's disease plays a pivotal role in patient care and clinical trials. In this study, we developed a new approach based on 3D deep convolutional neural networks to accurately differentiate mild Alzheimer's disease dementia from mild cognitive impairment (MCI) and cognitively normal individuals using structural MRIs. For comparison, we built a reference model based on the volumes and thicknesses of previously reported brain regions known to be implicated in disease progression. We validated both models on an internal held-out cohort from The Alzheimer's Disease Neuroimaging Initiative (ADNI) and on an external independent cohort from The National Alzheimer's Coordinating Center (NACC). The deep-learning model is accurate, achieving an area under the curve (AUC) of 85.12 when distinguishing between cognitively normal subjects and subjects with either MCI or mild Alzheimer's dementia. In the more challenging task of detecting MCI, it achieves an AUC of 62.45. It is also significantly faster than the volume/thickness model, which requires the volumes and thicknesses to be extracted beforehand. The model can also be used to forecast progression: subjects with mild cognitive impairment misclassified as having mild Alzheimer's disease dementia by the model were faster to progress to dementia over time. An analysis of the features learned by the proposed model shows that it relies on a wide range of regions associated with Alzheimer's disease. These findings suggest that deep neural networks can automatically learn to identify imaging biomarkers that are predictive of Alzheimer's disease and leverage them to achieve accurate early detection of the disease.