NYUHSL Faculty Bibliography

Journal of the American Medical Informatics Association. 2024:31(9):1983-1993. DOI: 10.1093/jamia/ocae117

Evaluation of GPT-4 ability to identify and generate patient instructions for actionable incidental radiology findings

Woo, Kar-Mun C; Simon, Gregory W; Akindutire, Olumide; Aphinyanaphongs, Yindalon; Austrian, Jonathan S; Kim, Jung G; Genes, Nicholas; Goldenring, Jacob A; Major, Vincent J; Pariente, Chloé S; Pineda, Edwin G; Kang, Stella K

OBJECTIVES: To evaluate the proficiency of a HIPAA-compliant version of GPT-4 in identifying actionable, incidental findings from unstructured radiology reports of Emergency Department patients, and to assess the appropriateness of artificial intelligence (AI)-generated, patient-facing summaries of these findings. MATERIALS AND METHODS: Radiology reports extracted from the electronic health record of a large academic medical center were manually reviewed to identify non-emergent, incidental findings with high likelihood of requiring follow-up, further sub-stratified as "definitely actionable" (DA) or "possibly actionable-clinical correlation" (PA-CC). Instruction prompts to GPT-4 were developed and iteratively optimized using a validation set of 50 reports. The optimized prompt was then applied to a test set of 430 unseen reports. GPT-4 performance was primarily graded on accuracy identifying either DA or PA-CC findings, then secondarily for DA findings alone. Outputs were reviewed for hallucinations. AI-generated patient-facing summaries were assessed for appropriateness via Likert scale. RESULTS: For the primary outcome (DA or PA-CC), GPT-4 achieved 99.3% recall, 73.6% precision, and 84.5% F-1. For the secondary outcome (DA only), GPT-4 demonstrated 95.2% recall, 77.3% precision, and 85.3% F-1. No findings were "hallucinated" outright. However, 2.8% of cases included generated text about recommendations that were inferred without specific reference. The majority of True Positive AI-generated summaries required no or minor revision. CONCLUSIONS: GPT-4 demonstrates proficiency in detecting actionable, incidental findings after refined instruction prompting. AI-generated patient instructions were most often appropriate but, in rare cases, included inferred recommendations. While this technology shows promise to augment diagnostics, active clinician oversight via "human-in-the-loop" workflows remains critical for clinical implementation.

PMID: 38778578

ISSN: 1527-974X

CID: 5654832
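
As a quick consistency check on the GPT-4 abstract above, the reported F-1 values can be recomputed from the reported precision and recall. This is a minimal sketch that assumes F-1 is the usual harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f1_score` is ours rather than anything from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) exactly as reported in the abstract.
reported = {
    "primary (DA or PA-CC)": (73.6, 99.3),
    "secondary (DA only)": (77.3, 95.2),
}

for label, (precision_pct, recall_pct) in reported.items():
    f1_pct = 100 * f1_score(precision_pct / 100, recall_pct / 100)
    print(f"{label}: F1 = {f1_pct:.1f}%")
```

Under that assumption, both recomputed values agree with the 84.5% and 85.3% figures quoted in the abstract.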

Journal of the American Geriatrics Society. 2024:72(7):2184-2194. DOI: 10.1111/jgs.18746

Scaling the EQUIPPED medication safety program: Traditional and hub-and-spoke implementation models

Vandenberg, Ann E; Hwang, Ula; Das, Shamie; Genes, Nicholas; Nyamu, Sylviah; Richardson, Lynne; Ezenkwele, Ugo; Legome, Eric; Richardson, Christopher; Belachew, Adam; Leong, Traci; Kegler, Michelle; Vaughan, Camille P

BACKGROUND: The EQUIPPED (Enhancing Quality of Prescribing Practices for Older Adults Discharged from the Emergency Department) medication safety program is an evidence-informed quality improvement initiative to reduce potentially inappropriate medications (PIMs) prescribed by Emergency Department (ED) providers to adults aged 65 and older at discharge. We aimed to scale up this successful program using (1) a traditional implementation model at an ED with a novel electronic medical record and (2) a new hub-and-spoke implementation model at three new EDs within a health system that had previously implemented EQUIPPED (hub). We hypothesized that implementation speed would increase under the hub-and-spoke model without cost to PIM reduction or site engagement. METHODS: We evaluated the effect of the EQUIPPED program on PIMs for each ED, comparing their 12-month baseline to 12-month post-implementation period prescribing data, number of months to implement EQUIPPED, and facilitators and barriers to implementation. RESULTS: The proportion of PIMs at all four sites declined significantly from pre- to post-EQUIPPED: at traditional site 1 from 8.9% (8.1-9.6) to 3.6% (3.6-9.6) (p < 0.001); at spread site 1 from 12.2% (11.2-13.2) to 7.1% (6.1-8.1) (p < 0.001); at spread site 2 from 11.3% (10.1-12.6) to 7.9% (6.4-8.8) (p = 0.045); and at spread site 3 from 16.2% (14.9-17.4) to 11.7% (10.3-13.0) (p < 0.001). Time to implement was equivalent at all sites across both models. Interview data, reflecting a wide scope of responsibilities for the champion at the traditional site and a narrow scope at the spoke sites, indicated disproportionate barriers to engagement at the spoke sites. CONCLUSIONS: EQUIPPED was successfully implemented under both implementation models at four new sites during the COVID-19 pandemic, indicating the feasibility of adapting EQUIPPED to complex, real-world conditions. The hub-and-spoke model offers an effective way to scale up EQUIPPED, though a speed or quality advantage could not be shown.

PMID: 38259070

ISSN: 1532-5415

CID: 5624832

Annals of emergency medicine. 2024:83(5):467-476. DOI: 10.1016/j.annemergmed.2023.12.014

The Clinical Emergency Data Registry: Structure, Use, and Limitations for Research

Lin, Michelle P; Sharma, Dhruv; Venkatesh, Arjun; Epstein, Stephen K; Janke, Alexander; Genes, Nicholas; Mehrotra, Abhi; Augustine, James; Malcolm, Bill; Goyal, Pawan; Griffey, Richard T

The Clinical Emergency Data Registry (CEDR) is a qualified clinical data registry that collects data from participating emergency departments (EDs) in the United States for quality measurement, improvement, and reporting purposes. This article aims to provide an overview of the data collection and validation process, describe the existing data structure and elements, and explain the potential opportunities and limitations for ongoing and future research use. CEDR data are primarily collected for quality reporting purposes and are obtained from diverse sources, including electronic health records and billing data that are de-identified and stored in a secure, centralized database. The CEDR data structure is organized around clinical episodes, which contain multiple data elements that are standardized using common data elements and are mapped to established terminologies to enable interoperability and data sharing. The data elements include patient demographics, clinical characteristics, diagnostic and treatment procedures, and outcomes. Key limitations include the limited generalizability due to the selective nature of participating EDs and the limited validation and completeness of data elements not currently used for quality reporting purposes, including demographic data. Nonetheless, CEDR holds great potential for ongoing and future research in emergency medicine due to its large-volume, longitudinal, near real-time, clinical data. In 2021, the American College of Emergency Physicians authorized the transition from CEDR to the Emergency Medicine Data Institute, which will catalyze investments in improved data quality and completeness for research to advance emergency care.

PMID: 38276937

ISSN: 1097-6760

CID: 5625412

medRxiv. 2024. DOI: 10.1101/2023.07.10.23292373

Evaluating Large Language Models in Extracting Cognitive Exam Dates and Scores

Zhang, Hao; Jethani, Neil; Jones, Simon; Genes, Nicholas; Major, Vincent J; Jaffe, Ian S; Cardillo, Anthony B; Heilenbach, Noah; Ali, Nadia Fazal; Bonanni, Luke J; Clayburn, Andrew J; Khera, Zain; Sadler, Erica C; Prasad, Jaideep; Schlacter, Jamie; Liu, Kevin; Silva, Benjamin; Montgomery, Sophie; Kim, Eric J; Lester, Jacob; Hill, Theodore M; Avoricani, Alba; Chervonski, Ethan; Davydov, James; Small, William; Chakravartty, Eesha; Grover, Himanshu; Dodson, John A; Brody, Abraham A; Aphinyanaphongs, Yindalon; Masurkar, Arjun; Razavian, Narges

IMPORTANCE: Large language models (LLMs) are crucial for medical tasks. Ensuring their reliability is vital to avoid false results. Our study assesses two state-of-the-art LLMs (ChatGPT and LLaMA-2) for extracting clinical information, focusing on cognitive tests like MMSE and CDR. OBJECTIVE: To evaluate ChatGPT and LLaMA-2 performance in extracting MMSE and CDR scores, including their associated dates. METHODS: Our data consisted of 135,307 clinical notes (Jan 12th, 2010 to May 24th, 2023) mentioning MMSE, CDR, or MoCA. After applying inclusion criteria, 34,465 notes remained, of which 765 were processed by ChatGPT (GPT-4) and LLaMA-2, and 22 experts reviewed the responses. ChatGPT successfully extracted MMSE and CDR instances with dates from 742 notes. We used 20 notes for fine-tuning and training the reviewers. The remaining 722 were assigned to reviewers, with 309 each assigned to two reviewers simultaneously. Inter-rater agreement (Fleiss' Kappa), precision, recall, true/false negative rates, and accuracy were calculated. Our study follows TRIPOD reporting guidelines for model validation. RESULTS: For MMSE information extraction, ChatGPT (vs. LLaMA-2) achieved accuracy of 83% (vs. 66.4%), sensitivity of 89.7% (vs. 69.9%), true-negative rates of 96% (vs. 60.0%), and precision of 82.7% (vs. 62.2%). For CDR the results were lower overall, with accuracy of 87.1% (vs. 74.5%), sensitivity of 84.3% (vs. 39.7%), true-negative rates of 99.8% (vs. 98.4%), and precision of 48.3% (vs. 16.1%). We qualitatively evaluated the MMSE errors of ChatGPT and LLaMA-2 on double-reviewed notes. LLaMA-2 errors included 27 cases of total hallucination, 19 cases of reporting other scores instead of MMSE, 25 missed scores, and 23 cases of reporting only the wrong date. In comparison, ChatGPT's errors included only 3 cases of total hallucination, 17 cases of wrong test reported instead of MMSE, and 19 cases of reporting a wrong date. CONCLUSIONS: In this diagnostic/prognostic study of ChatGPT and LLaMA-2 for extracting cognitive exam dates and scores from clinical notes, ChatGPT exhibited high accuracy, with better performance compared to LLaMA-2. The use of LLMs could benefit dementia research and clinical care by identifying eligible patients for treatment initiation or clinical trial enrollment. Rigorous evaluation of LLMs is crucial to understanding their capabilities and limitations.

PMCID:10888985

PMID: 38405784

CID: 5722422

Applied clinical informatics. 2024:15(1):155-163. DOI: 10.1055/a-2237-8309

Structure and Funding of Clinical Informatics Fellowships: A National Survey of Program Directors

Patel, Tushar N; Chaise, Aaron J; Hanna, John J; Patel, Kunal P; Kochendorfer, Karl M; Medford, Richard J; Mize, Dara E; Melnick, Edward R; Hron, Jonathan D; Youens, Kenneth; Pandita, Deepti; Leu, Michael G; Ator, Gregory A; Yu, Feliciano; Genes, Nicholas; Baker, Carrie K; Bell, Douglas S; Pevnick, Joshua M; Conrad, Steven A; Chandawarkar, Aarti R; Rogers, Kendall M; Kaelber, David C; Singh, Ila R; Levy, Bruce P; Finnell, John T; Kannry, Joseph; Pageler, Natalie M; Mohan, Vishnu; Lehmann, Christoph U

BACKGROUND: In 2011, the American Board of Medical Specialties established clinical informatics (CI) as a subspecialty in medicine, jointly administered by the American Board of Pathology and the American Board of Preventive Medicine. Subsequently, many institutions created CI fellowship training programs to meet the growing need for informaticists. Although many programs share similar features, there is considerable variation in program funding and administrative structures. OBJECTIVES: The aim of our study was to characterize CI fellowship program features, including governance structures, funding sources, and expenses. METHODS: We created a cross-sectional online REDCap survey with 44 items requesting information on program administration, fellows, administrative support, funding sources, and expenses. We surveyed program directors of programs accredited by the Accreditation Council for Graduate Medical Education between 2014 and 2021. RESULTS: We invited 54 program directors, of whom 41 (76%) completed the survey. The average administrative support received was $27,732/year. Most programs (85.4%) were accredited to have two or more fellows per year. Programs were administratively housed under six departments: Internal Medicine (17; 41.5%), Pediatrics (7; 17.1%), Pathology (6; 14.6%), Family Medicine (6; 14.6%), Emergency Medicine (4; 9.8%), and Anesthesiology (1; 2.4%). Funding sources for CI fellowship program directors included: hospital or health systems (28.3%), clinical departments (28.3%), graduate medical education office (13.2%), biomedical informatics department (9.4%), hospital information technology (9.4%), research and grants (7.5%), and other sources (3.8%) that included philanthropy and external entities. CONCLUSION: CI fellowships have been established in leading academic and community health care systems across the country. Due to their unique training requirements, these programs require significant resources for education, administration, and recruitment. There continues to be considerable heterogeneity in funding models between programs. Our survey findings reinforce the need for reformed federal funding models for informatics practice and training.

PMCID:10881258

PMID: 38171383

ISSN: 1869-0327

CID: 5633772

JAMA network open. 2023:6(12). DOI: 10.1001/jamanetworkopen.2023.49136

Electronic Health Record Messaging Patterns of Health Care Professionals in Inpatient Medicine

Small, William; Iturrate, Eduardo; Austrian, Jonathan; Genes, Nicholas

PMID: 38147337

ISSN: 2574-3805

CID: 5623492

Clinical practice & cases in emergency medicine. 2023:7(4):210-214. DOI: 10.5811/cpcem.1259

Mpox in the Emergency Department: A Case Series

Musharbash, Michael; DiLorenzo, Madeline; Genes, Nicholas; Mukherjee, Vikramjit; Klinger, Amanda

INTRODUCTION: We sought to describe the demographic characteristics, clinical features, and outcomes of a cohort of patients who presented to our emergency departments with mpox (formerly known as monkeypox) infection between May 1 and August 1, 2022. CASE SERIES: We identified 145 patients tested for mpox, of whom 79 were positive. All positive cases were among cisgender men, and the majority (92%) were among men who have sex with men. A large number of patients (39%) were human immunodeficiency virus (HIV) positive. There was wide variation in emergency department (ED) length of stay (range 2-16 hours, median 4 hours) and test turnaround time (range 1-11 days, median 4 days). Most patients (95%) were discharged, although a substantial proportion (22%) had a return visit within 30 days, and 28% ultimately received tecovirimat. CONCLUSION: Patients who presented to our ED with mpox had similar demographic characteristics and clinical features as those described in other clinical settings during the 2022 outbreak. While there were operational challenges to the evaluation and management of these patients, demonstrated by variable lengths of stay and frequent return visits, most were able to be discharged.

PMCID:10855293

PMID: 38353186

ISSN: 2474-252X

CID: 5635742

Applied clinical informatics. 2023:14(5):951-960. DOI: 10.1055/s-0043-1776404

A Systematic Approach to the Design and Implementation of Clinical Informatics Fellowship Programs

Lingham, Veena; Chandwarkar, Aarti; Miller, Michael; Baker, Carrie; Genes, Nicholas; Hellems, Martha; Khanna, Raman; Mize, Dara; Silverman, Howard

Clinical Informatics (CI), a medical subspecialty since 2011, has grown from the initial four fellowship programs accredited by the Accreditation Council for Graduate Medical Education (ACGME) in 2014 to more than 50 and counting in the present day. In parallel, the literature guiding Clinical Informatics Fellowship training and the curriculum evolved from the original core content published in 2009 to the more recent CI Subspecialty Delineation of Practice and the updated ACGME Milestones 2.0 for CI. In this paper, we outline this evolution and its impact on CIF Curricula. We then propose a framework, specific processes, and tools to standardize the design and optimize the implementation of CIF programs.

PMCID:10700146

PMID: 38057262

ISSN: 1869-0327

CID: 5589712

American journal of emergency medicine. 2023:68:22-27. DOI: 10.1016/j.ajem.2023.02.020

Incidence of rescue surgical airways after attempted orotracheal intubation in the emergency department: A National Emergency Airway Registry (NEAR) Study

Offenbacher, Joseph; Nikolla, Dhimitri A; Carlson, Jestin N; Smith, Silas W; Genes, Nicholas; Boatright, Dowin H; Brown, Calvin A

BACKGROUND: Cricothyrotomy is a critical technique for rescue of the failed airway in the emergency department (ED). Since the adoption of video laryngoscopy, the incidence of rescue surgical airways (those performed after at least one unsuccessful orotracheal or nasotracheal intubation attempt), and the circumstances where they are attempted, have not been characterized. OBJECTIVE: We report the incidence and indications for rescue surgical airways using a multicenter observational registry. METHODS: We performed a retrospective analysis of rescue surgical airways in subjects ≥14 years of age. We describe patient, clinician, airway management, and outcome variables. RESULTS: Of 19,071 subjects in NEAR, 17,720 (92.9%) were ≥14 years old with at least one initial orotracheal or nasotracheal intubation attempt; of these, 49 received a rescue surgical airway attempt, an incidence of 2.8 cases per 1000 (0.28% [95% confidence interval 0.21 to 0.37]). The median number of airway attempts prior to rescue surgical airways was 2 (interquartile range 1, 2). Twenty-five were in trauma victims (51.0% [36.5 to 65.4]), with neck trauma being the most common traumatic indication (n = 7, 14.3% [6.4 to 27.9]). CONCLUSION: Rescue surgical airways occurred infrequently in the ED (0.28% [0.21 to 0.37]), with approximately half performed due to a trauma indication. These results may have implications for surgical airway skill acquisition, maintenance, and experience.

PMID: 36905882

ISSN: 1532-8171

CID: 5542042
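
The incidence figure in the NEAR abstract above (49 rescue surgical airways among 17,720 eligible subjects) can be reproduced with a short calculation. The abstract does not say which confidence-interval method the authors used; the sketch below assumes a Wilson score interval with z = 1.96, chosen only because it is a common default for small proportions. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method, z = 1.96)."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


events, n = 49, 17_720  # counts taken from the abstract
incidence = events / n
low, high = wilson_interval(events, n)

print(f"incidence: {1000 * incidence:.1f} per 1000 ({100 * incidence:.2f}%)")
print(f"95% CI (Wilson, assumed): {100 * low:.2f}% to {100 * high:.2f}%")
```

With these inputs the point estimate rounds to 2.8 per 1000 (0.28%) and the Wilson interval rounds to 0.21% to 0.37%, in line with the abstract; other interval methods may give slightly different bounds.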

Journal of medical ethics. 2023:49(3):156-159. DOI: 10.1136/medethics-2021-107759

Patient portal access for caregivers of adult and geriatric patients: reframing the ethics of digital patient communication

Ganta, Teja; Appel, Jacob M; Genes, Nicholas

Patient portals are poised to transform health communication by empowering patients with rapid access to their own health data. The 21st Century Cures Act is a US federal law that, among other provisions, prevents health entities from engaging in practices that disrupt the exchange of electronic health information, a measure that may increase the usage of patient health portals. Caregiver access to patient portals, however, may lead to breaches in patient privacy and confidentiality if not managed properly through proxy accounts. We present an ethical framework that guides policy and clinical workflow development for healthcare institutions to support the best use of patient portals. Caregivers are vital members of the care team and should be supported through novel forms of health information technology (IT). Patients, however, may not want all information to be shared with their proxies, so healthcare institutions must support the development and use of separate proxy accounts, as opposed to using the patient's own account, as well as provide controls for limiting the scope of information displayed in the proxy accounts. Lastly, as socioeconomic barriers to adoption of health IT persist, healthcare providers must work to ensure multiple streams of patient communication, to prevent further propagating health inequities.

PMID: 35437282

ISSN: 1473-4257

CID: 5218212