Evaluating Hospital Course Summarization by an Electronic Health Record-Based Large Language Model
Small, William R; Austrian, Jonathan; O'Donnell, Luke; Burk-Rafel, Jesse; Hochman, Katherine A; Goodman, Adam; Zaretsky, Jonah; Martin, Jacob; Johnson, Stephen; Major, Vincent J; Jones, Simon; Henke, Christian; Verplanke, Benjamin; Osso, Jwan; Larson, Ian; Saxena, Archana; Mednick, Aron; Simonis, Choumika; Han, Joseph; Kesari, Ravi; Wu, Xinyuan; Heery, Lauren; Desel, Tenzin; Baskharoun, Samuel; Figman, Noah; Farooq, Umar; Shah, Kunal; Jahan, Nusrat; Kim, Jeong Min; Testa, Paul; Feldman, Jonah
IMPORTANCE: Hospital course (HC) summarization represents an increasingly onerous discharge summary component for physicians. Literature supports large language models (LLMs) for HC summarization, but whether physicians can effectively partner with electronic health record-embedded LLMs to draft HCs is unknown. OBJECTIVES: To compare the editing effort required by time-constrained resident physicians to improve LLM- vs physician-generated HCs toward a novel 4Cs (complete, concise, cohesive, and confabulation-free) HC. DESIGN, SETTING, AND PARTICIPANTS: Quality improvement study using a convenience sample of 10 internal medicine resident editors, 8 hospitalist evaluators, and randomly selected general medicine admissions in December 2023 lasting 4 to 8 days at New York University Langone Health. EXPOSURES: Residents and hospitalists reviewed randomly assigned patient medical records for 10 minutes. Residents, blinded to author type, edited each HC pair (physician and LLM) for quality in 3 minutes, followed by comparative ratings by attending hospitalists. MAIN OUTCOMES AND MEASURES: Editing effort was quantified by analyzing the edits that occurred on the HC pairs after controlling for length (percentage edited) and the degree to which the original HCs' meaning was altered (semantic change). Hospitalists compared edited HC pairs with A/B testing on the 4Cs (5-point Likert scales converted to 10-point bidirectional scales). RESULTS: Among 100 admissions, compared with physician HCs, residents edited a smaller percentage of LLM HCs (LLM mean [SD], 31.5% [16.6%] vs physicians, 44.8% [20.0%]; P < .001). Additionally, LLM HCs required less semantic change (LLM mean [SD], 2.4% [1.6%] vs physicians, 4.9% [3.5%]; P < .001). Attending physicians deemed LLM HCs to be more complete (mean [SD] difference LLM vs physicians on 10-point bidirectional scale, 3.00 [5.28]; P < .001), similarly concise (mean [SD], -1.02 [6.08]; P = .20) and cohesive (mean [SD], 0.70 [6.14]; P = .60), but with more confabulations (mean [SD], -0.98 [3.53]; P = .002). The composite scores were similar (mean [SD] difference LLM vs physician on 40-point bidirectional scale, 1.70 [14.24]; P = .46). CONCLUSIONS AND RELEVANCE: Electronic health record-embedded LLM HCs required less editing than physician-generated HCs to approach a quality standard, resulting in HCs that were comparably or more complete, concise, and cohesive, but contained more confabulations. Despite the potential influence of artificial time constraints, this study supports the feasibility of a physician-LLM partnership for writing HCs and provides a basis for monitoring LLM HCs in clinical practice.
PMID: 40802185
ISSN: 2574-3805
CID: 5906762
Patient portal messaging to address delayed follow-up for uncontrolled diabetes: a pragmatic, randomised clinical trial
Nagler, Arielle R; Horwitz, Leora Idit; Ahmed, Aamina; Mukhopadhyay, Amrita; Dapkins, Isaac; King, William; Jones, Simon A; Szerencsy, Adam; Pulgarin, Claudia; Gray, Jennifer; Mei, Tony; Blecker, Saul
IMPORTANCE: Patients with poor glycaemic control have a high risk for major cardiovascular events. Improving glycaemic monitoring in patients with diabetes can improve morbidity and mortality. OBJECTIVE: To assess the effectiveness of a patient portal message in prompting patients with poorly controlled diabetes without a recent glycated haemoglobin (HbA1c) result to have their HbA1c repeated. DESIGN: A pragmatic, randomised clinical trial. SETTING: A large academic health system consisting of over 350 ambulatory practices. PARTICIPANTS: Patients with an HbA1c greater than 10% who had not had a repeat HbA1c in the prior 6 months. EXPOSURES: A single electronic health record (EHR)-based patient portal message to prompt patients to have a repeat HbA1c test versus usual care. MAIN OUTCOMES: The primary outcome was a follow-up HbA1c test result within 90 days of randomisation. RESULTS: The study included 2573 patients with a mean (SD) HbA1c of 11.2%. Among 1317 patients in the intervention group, 24.2% had follow-up HbA1c tests completed within 90 days, versus 21.1% of 1256 patients in the control group (p=0.07). Patients in the intervention group were more likely to log into the patient portal within 60 days as compared with the control group (61.2% vs 52.3%, p<0.001). CONCLUSIONS: Among patients with poorly controlled diabetes and no recent HbA1c result, a brief patient portal message did not significantly increase follow-up testing but did increase patient engagement with the patient portal. Automated patient messages could be considered as a part of multipronged efforts to involve patients in their diabetes care.
PMID: 40348403
ISSN: 2044-5423
CID: 5843792
Tracking inflammation status for improving patient prognosis: A review of current methods, unmet clinical needs and opportunities
Raju, Vidya; Reddy, Revanth; Javan, Arzhang Cyrus; Hajihossainlou, Behnam; Weissleder, Ralph; Guiseppi-Elie, Anthony; Kurabayashi, Katsuo; Jones, Simon A; Faghih, Rose T
Inflammation is the body's response to infection, trauma or injury and is activated in a coordinated fashion to ensure the restoration of tissue homeostasis and healthy physiology. This process requires communication between stromal cells resident to the tissue compartment and infiltrating immune cells, which is dysregulated in disease. Clinical innovations in patient diagnosis and stratification include measures of inflammatory activation that support the assessment of patient prognosis and response to therapy. We propose that (i) the recent advances in fast, dynamic monitoring of inflammatory markers (e.g., cytokines) and (ii) data-dependent theoretical and computational modeling of inflammatory marker dynamics will enable the quantification of the inflammatory response, identification of optimal, disease-specific biomarkers and the design of personalized interventions to improve patient outcomes. These are multidisciplinary efforts to which biomedical engineers may contribute. To illustrate these ideas, we describe the actions of cytokines, acute phase proteins and hormones in the inflammatory response and discuss their role in local wounds, COVID-19, cancer, autoimmune diseases, neurodegenerative diseases and aging, with a central focus on cardiac surgery. We also discuss the challenges and opportunities involved in tracking and modulating inflammation in clinical settings.
PMID: 40324661
ISSN: 1873-1899
CID: 5855652
Clinical Decision Support Leveraging Health Information Exchange improves Concordance with Patient's Resuscitation Orders and End-Of-Life Wishes
Chakravartty, Eesha; Silberlust, Jared; Blecker, Saul; Zhao, Yunan; Alendy, Fariza; Menzer, Heather; Ahmed, Aamina; Jones, Simon; Ferrauiola, Meg; Austrian, Jonathan Saul
OBJECTIVES: Improve concordance between patient end-of-life preferences and code status orders by incorporating data from a state registry with Clinical Decision Support (CDS) within the electronic health record (EHR) to preserve patient autonomy and ensure that patients receive care that aligns with their wishes. METHODS: Leveraging a Health Information Exchange (HIE) interface between the New York State Medical Orders for Life-Sustaining Treatment (eMOLST) registry and the EHR of our academic health system, we developed a bundled CDS intervention that displays eMOLST information at the time of code status ordering and provides an in-line alert when providers enter a resuscitation order discordant with wishes documented in the eMOLST registry. To evaluate this intervention, we performed a segmented regression analysis of an interrupted time series to compare the percentage of discordant orders before and after implementation among all hospitalizations for which an eMOLST was available. RESULTS: We identified a total of 3648 visits that had an eMOLST filed prior to inpatient admission and a code status order placed during admission. There was a statistically significant change in discordant resuscitation orders of -5.95% [95% CI: -9.95%, -1.94%; p=0.009] after the intervention went live, corresponding to a relative risk reduction of 25%. A logistic regression model adjusted for covariates showed an average marginal effect of -5.12% after the intervention [CI: -9.75%, -0.50%; p=0.03]. CONCLUSIONS: Our intervention resulted in a decrease in discordant resuscitation orders. This study demonstrates that accessibility to eMOLST data within the provider workflow supported by CDS can reduce discrepancies between patient end-of-life wishes and hospital code status orders.
PMID: 40267976
ISSN: 1869-0327
CID: 5830322
A descriptive analysis of nurses' self-reported mental health symptoms during the COVID-19 pandemic: An international study
Squires, Allison; Dutton, Hillary J; Casales-Hernandez, Maria Guadalupe; Rodriguez López, Javier Isidro; Jimenez-Sanchez, Juana; Saldarriaga-Dixon, Paola; Bernal Cespedes, Cornelia; Flores, Yesenia; Arteaga Cordova, Maryuri Ibeth; Castillo, Gabriela; Loza Sosa, Jannette Marga; Garcia, Julio; Ramirez, Taycia; González-Nahuelquin, Cibeles; Amaya, Teresa; Guedes Dos Santos, Jose Luis; Muñoz Rojas, Derby; Buitrago-Malaver, Lilia Andrea; Rojas-Pineda, Fiorella Jackeline; Alvarez Watson, Jose Luis; Gómez Del Pulgar, Mercedes; Anyorikeya, Maria; Bilgin, Hulya; Blaževičienė, Aurelija; Buranda, Lucky Sarjono; Castillo, Theresa P; Cedeño Tapia, Stefanía Johanna; Chiappinotto, Stefania; Damiran, Dulamsuren; Duka, Blerina; Ejupi, Vlora; Ismail, Mohamed Jama; Khatun, Shanzida; Koy, Virya; Lee, Seung Eun; Lee, Taewha; Lickiewicz, Jakub; Macijauskienė, Jūratė; Malinowska-Lipien, Iwona; Nantsupawat, Apiradee; Nashwan, Abdulqadir J; Ahmed, Fadumo Osman; Ozakgul, Aylin; Paarima, Yennuten; Palese, Alvisa; Ramirez, Veronica E; Tsuladze, Alisa; Tulek, Zeliha; Uchaneishvili, Maia; Wekem Kukeba, Margaret; Yanjmaa, Enkhjargal; Patel, Honey; Ma, Zhongyue; Goldsamt, Lloyd A; Jones, Simon
AIM: To describe the self-reported mental health of nurses from 35 countries who worked during the COVID-19 pandemic. BACKGROUND: There is little occupationally specific data about nurses' mental health worldwide. Studies have documented the impact on nurses' mental health of the COVID-19 pandemic, but few have baseline referents. METHODS: A descriptive, cross-sectional design structured the study. Data reflect a convenience sample of 9,387 participants who completed the opt-in survey between July 31, 2022, and October 31, 2023. Descriptive statistics were run to analyze the following variables associated with mental health: Self-reports of mental health symptoms, burnout, personal losses during the pandemic, access to mental health services, and self-care practices used to cope with pandemic-related stressors. Reporting of this study was steered by the STROBE guideline for quantitative studies. RESULTS: Anxiety or depression occurred at rates ranging from 23%-61%, with country-specific trends in reporting observed. Approximately 18% of the sample reported experiencing some symptoms of burnout. The majority of nurses' employers did not provide mental health support in the workplace. Most reported more frequently engaging with self-care practices compared with before the pandemic. Notably, 20% of nurses suffered the loss of a family member, 35% lost a friend, and 34% a coworker due to COVID-19. Nearly half (48%) reported experiencing public aggression due to their identity as a nurse. CONCLUSIONS: The data obtained establish a basis for understanding the specific mental health needs of the nursing workforce globally, highlighting key areas for service development. IMPLICATIONS FOR NURSING POLICY: Healthcare organizations and governmental bodies need to develop targeted mental health support programs that are readily accessible to nurses to foster a resilient nursing workforce.
PMID: 39871528
ISSN: 1466-7657
CID: 5780662
Using Interpersonal Continuity of Care in Home Health Physical Therapy to Reduce Hospital Readmissions
Engel, Patrick; Vorensky, Mark; Squires, Allison; Jones, Simon
This paper examines the relationship between continuity of care with home health physical therapists following hospitalization and the likelihood of readmission. We conducted a retrospective cohort study. Using rehospitalization as the dependent variable, a continuity of care indicator variable was analyzed with a multivariable logistic regression. The indicator variable was created using the Bice-Boxerman Index to measure physical therapist continuity of care. The mean of the index (0.81) was used to separate high continuity of care (0.81 or greater) from low continuity of care (lower than 0.81). The sample included 90,220 patients, with data coming from the linking of the Outcome and Assessment Information Set (OASIS) and an administrative dataset. All subjects lived in the NYC metro area. Inclusion criteria were admission to a first home health care site following discharge between 2010 and 2015 and identification as Male or Female. In comparison to low continuity of physical therapy, high continuity of physical therapy significantly decreased hospital readmissions (OR = 0.74, 95% CI 0.71-0.76, p ≤ .001, AME = -4.28%). Interpersonal continuity of physical therapy care has been identified as a key factor in decreasing readmissions from the home care setting. The research suggests that an increased emphasis on preserving physical therapist continuity following hospitalization should be explored, with the potential to reduce hospital readmissions.
PMCID:12293198
PMID: 40718154
ISSN: 1084-8223
CID: 5903042
Evaluating Large Language Models in extracting cognitive exam dates and scores
Zhang, Hao; Jethani, Neil; Jones, Simon; Genes, Nicholas; Major, Vincent J; Jaffe, Ian S; Cardillo, Anthony B; Heilenbach, Noah; Ali, Nadia Fazal; Bonanni, Luke J; Clayburn, Andrew J; Khera, Zain; Sadler, Erica C; Prasad, Jaideep; Schlacter, Jamie; Liu, Kevin; Silva, Benjamin; Montgomery, Sophie; Kim, Eric J; Lester, Jacob; Hill, Theodore M; Avoricani, Alba; Chervonski, Ethan; Davydov, James; Small, William; Chakravartty, Eesha; Grover, Himanshu; Dodson, John A; Brody, Abraham A; Aphinyanaphongs, Yindalon; Masurkar, Arjun; Razavian, Narges
Ensuring reliability of Large Language Models (LLMs) in clinical tasks is crucial. Our study assesses two state-of-the-art LLMs (ChatGPT and LlaMA-2) for extracting clinical information, focusing on cognitive tests like MMSE and CDR. Our data consisted of 135,307 clinical notes (Jan 12th, 2010 to May 24th, 2023) mentioning MMSE, CDR, or MoCA. After applying inclusion criteria, 34,465 notes remained, of which 765 were processed by ChatGPT (GPT-4) and LlaMA-2, and 22 experts reviewed the responses. ChatGPT successfully extracted MMSE and CDR instances with dates from 742 notes. We used 20 notes for fine-tuning and training the reviewers. The remaining 722 were assigned to reviewers, with 309 each assigned to two reviewers simultaneously. Inter-rater agreement (Fleiss' Kappa), precision, recall, true/false negative rates, and accuracy were calculated. Our study follows TRIPOD reporting guidelines for model validation. For MMSE information extraction, ChatGPT (vs. LlaMA-2) achieved accuracy of 83% (vs. 66.4%), sensitivity of 89.7% (vs. 69.9%), true-negative rates of 96% (vs. 60.0%), and precision of 82.7% (vs. 62.2%). For CDR the results were lower overall, with accuracy of 87.1% (vs. 74.5%), sensitivity of 84.3% (vs. 39.7%), true-negative rates of 99.8% (vs. 98.4%), and precision of 48.3% (vs. 16.1%). We qualitatively evaluated the MMSE errors of ChatGPT and LlaMA-2 on double-reviewed notes. LlaMA-2 errors included 27 cases of total hallucination, 19 cases of reporting other scores instead of MMSE, 25 missed scores, and 23 cases of reporting only the wrong date. In comparison, ChatGPT's errors included only 3 cases of total hallucination, 17 cases of wrong test reported instead of MMSE, and 19 cases of reporting a wrong date. In this diagnostic/prognostic study of ChatGPT and LlaMA-2 for extracting cognitive exam dates and scores from clinical notes, ChatGPT exhibited high accuracy, with better performance compared to LlaMA-2. The use of LLMs could benefit dementia research and clinical care by identifying eligible patients for treatment initiation or clinical trial enrollment. Rigorous evaluation of LLMs is crucial to understanding their capabilities and limitations.
PMCID:11634005
PMID: 39661652
ISSN: 2767-3170
CID: 5762692
Quality of care after a horizontal merger between two large academic hospitals
Wissink, Ilse J A; Schinkel, Michiel; Peters-Sengers, Hessel; Jones, Simon A; Vlaar, Alexander P J; Kruijthof, Karen J; Wiersinga, W Joost
BACKGROUND: Hospital mergers remain common, but their influence on healthcare quality varies. Data on the effects of European hospital mergers, and of academic hospital mergers in particular, are ill defined. This case study assesses early quality of care changes in two formerly competing Dutch academic hospitals that merged on June 6, 2018. METHODS: Statistical process control and interrupted time series analysis were performed. All adult, non-psychiatric patients admitted between 01-03-2016 and 01-10-2022 were eligible for analysis. The primary outcome measure was all-cause in-hospital mortality (or hospice); secondary outcomes were unplanned 30-day readmissions to the same hospital, length of stay, and patients' hospital rating. Data were obtained from electronic health records and patient experience surveys. FINDINGS: The mean (SD) age of the 573 813 included patients was 54·3 (18·9) years. The minority was female (277 817, 48·4 %), and most admissions were acute (308 597, 53·8 %). No merger-related change in mortality was found in the first 20 months post-merger (limited to the pre-Covid-19 era). For this same period, the 30-day readmission incidence changed to a downward slope post-merger, and the length of stay shortened (immediate level-change -3·796 % (95 % CI, -5·776 % to -1·816 %) and trend-change -0·150 % per month (95 % CI, -0·307 % to 0·007 %)). Patients' hospital ratings seemed to improve post-merger. INTERPRETATION: In this quality improvement study, a full and gradual post-merger integration strategy for a Dutch academic hospital merger was not associated with changes in in-hospital mortality, and yielded slightly improved results for secondary quality of care outcomes.
PMCID:11490856
PMID: 39430517
ISSN: 2405-8440
CID: 5739502
Development and evaluation of an artificial intelligence-based workflow for the prioritization of patient portal messages
Yang, Jie; So, Jonathan; Zhang, Hao; Jones, Simon; Connolly, Denise M; Golding, Claudia; Griffes, Esmelin; Szerencsy, Adam C; Wu, Tzer Jason; Aphinyanaphongs, Yindalon; Major, Vincent J
OBJECTIVES: Accelerating demand for patient messaging has impacted the practice of many providers. Messages are not recommended for urgent medical issues, but some do require rapid attention. This presents an opportunity for artificial intelligence (AI) methods to prioritize review of messages. Our study aimed to highlight some patient portal messages for prioritized review using a custom AI system integrated into the electronic health record (EHR). MATERIALS AND METHODS: We developed a Bidirectional Encoder Representations from Transformers (BERT)-based large language model using 40 132 patient-sent messages to identify patterns involving high acuity topics that warrant an immediate callback. The model was then implemented into 2 shared pools of patient messages managed by dozens of registered nurses. A primary outcome, such as the time before messages were read, was evaluated with a difference-in-difference methodology. RESULTS: [...] (n = 396 466), an improvement exceeding the trend was observed in the time high-scoring messages sit unread (21 minutes, 63 vs 42 for messages sent outside business hours). DISCUSSION: Our work shows great promise in improving care when AI is aligned with human workflow. Future work involves audience expansion, aiding users with suggested actions, and drafting responses. CONCLUSION: Many patients utilize patient portal messages, and while most messages are routine, a small fraction describe alarming symptoms. Our AI-based workflow shortens the turnaround time to get a trained clinician to review these messages to provide safer, higher-quality care.
PMCID:11328532
PMID: 39156046
ISSN: 2574-2531
CID: 5680362
Large Language Model-Based Responses to Patients' In-Basket Messages
Small, William R; Wiesenfeld, Batia; Brandfield-Harvey, Beatrix; Jonassen, Zoe; Mandal, Soumik; Stevens, Elizabeth R; Major, Vincent J; Lostraglio, Erin; Szerencsy, Adam; Jones, Simon; Aphinyanaphongs, Yindalon; Johnson, Stephen B; Nov, Oded; Mann, Devin
IMPORTANCE: Virtual patient-physician communications have increased since 2020 and negatively impacted primary care physician (PCP) well-being. Generative artificial intelligence (GenAI) drafts of patient messages could potentially reduce health care professional (HCP) workload and improve communication quality, but only if the drafts are considered useful. OBJECTIVES: To assess PCPs' perceptions of GenAI drafts and to examine linguistic characteristics associated with equity and perceived empathy. DESIGN, SETTING, AND PARTICIPANTS: This cross-sectional quality improvement study tested the hypothesis that PCPs' ratings of GenAI drafts (created using the electronic health record [EHR] standard prompts) would be equivalent to HCP-generated responses on 3 dimensions. The study was conducted at NYU Langone Health using private patient-HCP communications at 3 internal medicine practices piloting GenAI. EXPOSURES: Randomly assigned patient messages coupled with either an HCP message or the draft GenAI response. MAIN OUTCOMES AND MEASURES: PCPs rated responses' information content quality (eg, relevance), using a Likert scale, communication quality (eg, verbosity), using a Likert scale, and whether they would use the draft or start anew (usable vs unusable). Branching logic further probed for empathy, personalization, and professionalism of responses. Computational linguistics methods assessed content differences in HCP vs GenAI responses, focusing on equity and empathy. RESULTS: A total of 16 PCPs (8 [50.0%] female) reviewed 344 messages (175 GenAI drafted; 169 HCP drafted). Both GenAI and HCP responses were rated favorably. GenAI responses were rated higher for communication style than HCP responses (mean [SD], 3.70 [1.15] vs 3.38 [1.20]; P = .01, U = 12 568.5) but were similar to HCPs on information content (mean [SD], 3.53 [1.26] vs 3.41 [1.27]; P = .37; U = 13 981.0) and usable draft proportion (mean [SD], 0.69 [0.48] vs 0.65 [0.47], P = .49, t = -0.6842). Usable GenAI responses were considered more empathetic than usable HCP responses (32 of 86 [37.2%] vs 13 of 79 [16.5%]; difference, 125.5%), possibly attributable to more subjective (mean [SD], 0.54 [0.16] vs 0.31 [0.23]; P < .001; difference, 74.2%) and positive (mean [SD] polarity, 0.21 [0.14] vs 0.13 [0.25]; P = .02; difference, 61.5%) language. They were also numerically longer (mean [SD] word count, 90.5 [32.0] vs 65.4 [62.6]; difference, 38.4%), although this difference was not statistically significant (P = .07), and more linguistically complex (mean [SD] score, 125.2 [47.8] vs 95.4 [58.8]; P = .002; difference, 31.2%). CONCLUSIONS: In this cross-sectional study of PCP perceptions of an EHR-integrated GenAI chatbot, GenAI was found to communicate information better and with more empathy than HCPs, highlighting its potential to enhance patient-HCP communication. However, GenAI drafts were less readable than HCPs', a significant concern for patients with low health or English literacy.
PMCID:11252893
PMID: 39012633
ISSN: 2574-3805
CID: 5686582