NYUHSL Faculty Bibliography

Searched for:

in-biosketch:true

person:oermae01

Total Results:

139

Neurosurgical focus. 2025:59(1).DOI: 10.3171/2025.4.FOCUS24674

Introduction. Artificial intelligence in neurosurgery: transforming a data-intensive specialty

Hopkins, Benjamin S; Sutherland, Garnette R; Browd, Samuel R; Donoho, Daniel A; Oermann, Eric K; Schirmer, Clemens M; Pennicooke, Brenton; Asaad, Wael F

PMID: 40591964

ISSN: 1092-0684

CID: 5887762

arXiv. 2025.DOI:

Large-Scale Multi-omic Biosequence Transformers for Modeling Protein-Nucleic Acid Interactions

Chen, Sully F; Steele, Robert J; Hocky, Glen M; Lemeneh, Beakal; Lad, Shivanand P; Oermann, Eric K

The transformer architecture has revolutionized bioinformatics and driven progress in the understanding and prediction of the properties of biomolecules. To date, most biosequence transformers have been trained on a single omic (either proteins or nucleic acids) and have seen incredible success in downstream tasks in each domain, with particularly noteworthy breakthroughs in protein structural modeling. However, single-omic pre-training limits the ability of these models to capture cross-modal interactions. Here we present OmniBioTE, the largest open-source multi-omic model trained on over 250 billion tokens of mixed protein and nucleic acid data. We show that despite only being trained on unlabelled sequence data, OmniBioTE learns joint representations consistent with the central dogma of molecular biology. We further demonstrate that OmniBioTE achieves state-of-the-art results predicting the change in Gibbs free energy (∆G) of the binding interaction between a given nucleic acid and protein. Remarkably, we show that multi-omic biosequence transformers emergently learn useful structural information without any a priori structural training, allowing us to predict which protein residues are most involved in the protein-nucleic acid binding interaction. Lastly, compared to single-omic controls trained with identical compute, OmniBioTE demonstrates superior performance-per-FLOP and absolute accuracy across both multi-omic and single-omic benchmarks, highlighting the power of a unified modeling approach for biological sequences.

PMCID: 11998858

PMID: 40236839

ISSN: 2331-8422

CID: 5883432

Operative neurosurgery (Hagerstown, Md.). 2025.DOI: 10.1227/ons.0000000000001646

Intraoperative Evaluation of Dural Arteriovenous Fistula Obliteration Using FLOW 800 Hemodynamic Analysis

Sangwon, Karl L; Grin, Eric A; Negash, Bruck; Wiggan, Daniel D; Lapierre, Cathryn; Raz, Eytan; Shapiro, Maksim; Laufer, Ilya; Sharashidze, Vera; Rutledge, Caleb; Riina, Howard A; Oermann, Eric K; Nossek, Erez

BACKGROUND AND OBJECTIVES: Dural arteriovenous fistula (dAVF) surgery is a microsurgical procedure that requires confirmation of obliteration using formal cerebral angiography, but the lack of intraoperative angiogram or need for postoperative angiogram in some settings necessitates a search for alternative, less invasive methods to verify surgical success. This study evaluates the use of indocyanine green videoangiography FLOW 800 hemodynamic analysis intraoperatively during cranial and spinal dAVF obliteration to confirm obliteration and predict surgical success. METHODS: A retrospective analysis was conducted using indocyanine green videoangiography FLOW 800 to intraoperatively measure 4 hemodynamic parameters (Delay Time, Speed, Time to Peak, and Rise Time) across venous drainage regions of interest before and after dAVF obliteration. Univariate and multivariate statistical analyses to evaluate and visualize presurgical vs postsurgical state hemodynamic changes included nonparametric statistical tests, logistic regression, and Bayesian analysis. RESULTS: A total of 14 venous drainage regions of interest from 8 patients who had successful spinal or cranial dAVF obliteration confirmed with intraoperative digital subtraction angiography were extracted. Significant hemodynamic changes were observed after dAVF obliteration, with median Speed decreasing from 13.5 to 5.5 s⁻¹ (P = .029) and Delay Time increasing from 2.07 to 7.86 s (P = .020). Bayesian logistic regression identified Delay Time as the strongest predictor of postsurgical state, with a 50% increase associated with 2.16 times higher odds of achieving obliteration (odds ratio = 4.59, 95% highest density interval: 1.07-19.95). Speed exhibited a trend toward a negative association with postsurgical state (odds ratio = 0.62, 95% highest density interval: 0.26-1.42). Receiver operating characteristic-area under the curve analysis using logistic regression demonstrated a score of 0.760, highlighting Delay Time and Speed as key features distinguishing preobliteration and postobliteration states. CONCLUSION: Our findings demonstrate that intraoperative FLOW 800 analysis reliably quantifies and visualizes immediate hemodynamic changes consistent with dAVF obliteration. Speed and Delay Time emerged as key indicators of surgical success, highlighting the potential of FLOW 800 as a noninvasive adjunct to traditional imaging techniques for confirming dAVF obliteration intraoperatively.

PMID: 40434390

ISSN: 2332-4260

CID: 5855352

Journal of neuro-oncology. 2025.DOI: 10.1007/s11060-025-05026-9

Outcomes of concurrent versus non-concurrent immune checkpoint inhibition with stereotactic radiosurgery for melanoma brain metastases

Fu, Allen Ye; Bernstein, Kenneth; Zhang, Jeff; Silverman, Joshua; Mehnert, Janice; Sulman, Erik P; Oermann, Eric Karl; Kondziolka, Douglas

PURPOSE: Immune checkpoint inhibition (ICI) has revolutionized melanoma care. Stereotactic radiosurgery (SRS) combined with ICI has shown promise in prior studies for improving clinical outcomes in patients who have metastatic melanoma with brain metastases. However, others have suggested that concurrent ICI with SRS can increase the risk of complications. METHODS: We present a retrospective, single-institution analysis of 98 patients, with a median follow-up of 17.1 months, managed with ICI and SRS concurrently or non-concurrently. A total of 55 patients were included in the concurrent group and 43 patients in the non-concurrent group. Cox proportional hazards models were used to assess the relation between concurrent or non-concurrent treatment and overall survival or local progression-free survival. The Wald test was used to assess significance. Differences between the groups in adverse events, including adverse radiation effects, perilesional edema, and neurological deficits, were tested using the chi-square or Fisher's exact test. RESULTS: Patients receiving concurrent versus non-concurrent ICI showed a significant increase in overall survival (median 37.1 months, 95% CI: 18.9 months - NA versus median 11.4 months, 95% CI: 6.4-33.2 months, p = 0.0056) but not local progression-free survival. There were no significant differences between groups with regard to adverse radiation effects (2% versus 3%), perilesional edema (20% versus 9%), or neurological deficits (3% versus 20%). CONCLUSION: These results suggest that ICI does not increase the risk of neurological complications when delivered within 4 weeks of SRS.

PMID: 40183901

ISSN: 1573-7373

CID: 5819412

Cell reports. Medicine. 2025.DOI: 10.1016/j.xcrm.2025.102056

MetaGP: A generative foundation model integrating electronic health records and multimodal imaging for addressing unmet clinical needs

Liu, Fei; Zhou, Hongyu; Wang, Kai; Yu, Yunfang; Gao, Yuanxu; Sun, Zhuo; Liu, Sian; Sun, Shanshan; Zou, Zixing; Li, Zhuomin; Li, Bingzhou; Miao, Hanpei; Liu, Yang; Hou, Taiwa; Fok, Manson; Patil, Nivritti Gajanan; Xue, Kanmin; Li, Ting; Oermann, Eric; Yin, Yun; Duan, Lian; Qu, Jia; Huang, Xiaoying; Jin, Shengwei; Zhang, Kang

Artificial intelligence makes strides in specialized diagnostics but faces challenges in complex clinical scenarios, such as rare disease diagnosis and emergency condition identification. To address these limitations, we develop Meta General Practitioner (MetaGP), a 32-billion-parameter generative foundation model trained on extensive datasets, including over 8 million electronic health records, biomedical literature, and medical textbooks. MetaGP demonstrates robust diagnostic capabilities, achieving accuracy comparable to experienced clinicians. In rare disease cases, it achieves an average diagnostic score of 1.57, surpassing GPT-4's 0.93. For emergency conditions, it improves diagnostic accuracy for junior and mid-level clinicians by 53% and 46%, respectively. MetaGP also excels in generating medical imaging reports, producing high-quality outputs for chest X-rays and computed tomography, often rated comparable to or superior to physician-authored reports. These findings highlight MetaGP's potential to transform clinical decision-making across diverse medical contexts.

PMID: 40187356

ISSN: 2666-3791

CID: 5819502

Transplantation. 2025:109(3):399-402.DOI: 10.1097/TP.0000000000005261

Trials and Tribulations: Responses of ChatGPT to Patient Questions About Kidney Transplantation

Xu, Jingzhi; Mankowski, Michal; Vanterpool, Karen B; Strauss, Alexandra T; Lonze, Bonnie E; Orandi, Babak J; Stewart, Darren; Bae, Sunjae; Ali, Nicole; Stern, Jeffrey; Mattoo, Aprajita; Robalino, Ryan; Soomro, Irfana; Weldon, Elaina; Oermann, Eric K; Aphinyanaphongs, Yin; Sidoti, Carolyn; McAdams-DeMarco, Mara; Massie, Allan B; Gentry, Sommer E; Segev, Dorry L; Levan, Macey L

PMID: 39477825

ISSN: 1534-6080

CID: 5747132

Neurosurgery. 2025:96(2):233-234.DOI: 10.1227/neu.0000000000003305

Is It Really "Artificial" Intelligence?

Kondziolka, Douglas; Oermann, Eric K

PMID: 39812480

ISSN: 1524-4040

CID: 5883422

Nature medicine. 2025:31(2):609-617.DOI: 10.1038/s41591-024-03359-y

Self-improving generative foundation model for synthetic medical image generation and clinical applications

Wang, Jinzhuo; Wang, Kai; Yu, Yunfang; Lu, Yuxing; Xiao, Wenchao; Sun, Zhuo; Liu, Fei; Zou, Zixing; Gao, Yuanxu; Yang, Lei; Zhou, Hong-Yu; Miao, Hanpei; Zhao, Wenting; Huang, Lisha; Zeng, Lingchao; Guo, Rui; Chong, Ieng; Deng, Boyu; Cheng, Linling; Chen, Xiaoniao; Luo, Jing; Zhu, Meng-Hua; Baptista-Hon, Daniel; Monteiro, Olivia; Li, Ming; Ke, Yu; Li, Jiahui; Zeng, Simiao; Guan, Taihua; Zeng, Jin; Xue, Kanmin; Oermann, Eric; Luo, Huiyan; Yin, Yun; Zhang, Kang; Qu, Jia

In many clinical and research settings, the scarcity of high-quality medical imaging datasets has hampered the potential of artificial intelligence (AI) clinical applications. This issue is particularly pronounced in less common conditions, underrepresented populations and emerging imaging modalities, where the availability of diverse and comprehensive datasets is often inadequate. To address this challenge, we introduce a unified medical image-text generative model called MINIM that is capable of synthesizing medical images of various organs across various imaging modalities based on textual instructions. Clinician evaluations and rigorous objective measurements validate the high quality of MINIM's synthetic images. MINIM exhibits an enhanced generative capability when presented with previously unseen data domains, demonstrating its potential as a generalist medical AI (GMAI). Our findings show that MINIM's synthetic images effectively augment existing datasets, boosting performance across multiple medical applications such as diagnostics, report generation and self-supervised learning. On average, MINIM enhances performance by 12% for ophthalmic, 15% for chest, 13% for brain and 17% for breast-related tasks. Furthermore, we demonstrate MINIM's potential clinical utility in the accurate prediction of HER2-positive breast cancer from MRI images. Using a large retrospective simulation analysis, we demonstrate MINIM's clinical potential by accurately identifying targeted therapy-sensitive EGFR mutations using lung cancer computed tomography images, which could potentially lead to improved 5-year survival rates. Although these results are promising, further validation and refinement in more diverse and prospective settings would greatly enhance the model's generalizability and robustness.

PMID: 39663467

ISSN: 1546-170X

CID: 5762792

Nature medicine. 2025:31(2):618-626.DOI: 10.1038/s41591-024-03445-1

Medical large language models are vulnerable to data-poisoning attacks

Alber, Daniel Alexander; Yang, Zihao; Alyakin, Anton; Yang, Eunice; Rai, Sumedha; Valliani, Aly A; Zhang, Jeff; Rosenbaum, Gabriel R; Amend-Thomas, Ashley K; Kurland, David B; Kremer, Caroline M; Eremiev, Alexander; Negash, Bruck; Wiggan, Daniel D; Nakatsuka, Michelle A; Sangwon, Karl L; Neifert, Sean N; Khan, Hammad A; Save, Akshay Vinod; Palla, Adhith; Grin, Eric A; Hedman, Monika; Nasir-Moin, Mustafa; Liu, Xujin Chris; Jiang, Lavender Yao; Mankowski, Michal A; Segev, Dorry L; Aphinyanaphongs, Yindalon; Riina, Howard A; Golfinos, John G; Orringer, Daniel A; Kondziolka, Douglas; Oermann, Eric Karl

The adoption of large language models (LLMs) in healthcare demands a careful analysis of their potential to spread false medical knowledge. Because LLMs ingest massive volumes of data from the open Internet during training, they are potentially exposed to unverified medical knowledge that may include deliberately planted misinformation. Here, we perform a threat assessment that simulates a data-poisoning attack against The Pile, a popular dataset used for LLM development. We find that replacement of just 0.001% of training tokens with medical misinformation results in harmful models more likely to propagate medical errors. Furthermore, we discover that corrupted models match the performance of their corruption-free counterparts on open-source benchmarks routinely used to evaluate medical LLMs. Using biomedical knowledge graphs to screen medical LLM outputs, we propose a harm mitigation strategy that captures 91.9% of harmful content (F1 = 85.7%). Our algorithm provides a unique method to validate stochastically generated LLM outputs against hard-coded relationships in knowledge graphs. In view of current calls for improved data provenance and transparent LLM development, we hope to raise awareness of emergent risks from LLMs trained indiscriminately on web-scraped data, particularly in healthcare where misinformation can potentially compromise patient safety.

PMID: 39779928

ISSN: 1546-170X

CID: 5782182

Neurosurgery. 2024.DOI: 10.1227/neu.0000000000003297

CNS-CLIP: Transforming a Neurosurgical Journal Into a Multimodal Medical Model

Alyakin, Anton; Kurland, David; Alber, Daniel Alexander; Sangwon, Karl L; Li, Danxun; Tsirigos, Aristotelis; Leuthardt, Eric; Kondziolka, Douglas; Oermann, Eric Karl

BACKGROUND AND OBJECTIVES: Classical biomedical data science models are trained on a single modality and aimed at one specific task. However, the exponential increase in the size and capabilities of foundation models inside and outside medicine shows a shift toward task-agnostic models using large-scale, often internet-based, data. Recent research into smaller foundation models trained on specific literature, such as programming textbooks, demonstrated that they can display capabilities similar to or superior to large generalist models, suggesting a potential middle ground between small task-specific and large foundation models. This study attempts to introduce a domain-specific multimodal model, Congress of Neurological Surgeons (CNS)-Contrastive Language-Image Pretraining (CLIP), developed for neurosurgical applications, leveraging data exclusively from Neurosurgery Publications. METHODS: We constructed a multimodal data set of articles from Neurosurgery Publications through PDF data collection and figure-caption extraction using an artificial intelligence pipeline for quality control. Our final data set included 24,021 figure-caption pairs. We then developed a fine-tuning protocol for the OpenAI CLIP model. The model was evaluated on tasks including neurosurgical information retrieval, computed tomography imaging classification, and zero-shot ImageNet classification. RESULTS: CNS-CLIP demonstrated superior performance in neurosurgical information retrieval with a Top-1 accuracy of 24.56%, compared with 8.61% for the baseline. The average area under the receiver operating characteristic curve across 6 neuroradiology tasks achieved by CNS-CLIP was 0.95, slightly superior to OpenAI's Contrastive Language-Image Pretraining at 0.94 and significantly outperforming a vanilla vision transformer at 0.62. In generalist classification, CNS-CLIP reached a Top-1 accuracy of 47.55%, a decrease from the baseline of 52.37%, demonstrating a catastrophic forgetting phenomenon. CONCLUSION: This study presents a pioneering effort in building a domain-specific multimodal model using data from a medical society publication. The results indicate that domain-specific models, while less globally versatile, can offer advantages in specialized contexts. This emphasizes the importance of using tailored data and domain-focused development in training foundation models in neurosurgery and general medicine.

PMID: 39636129

ISSN: 1524-4040

CID: 5780182