Searched for: in-biosketch:yes person:sbj2002
How Difference Tasks Are Affected by Probability Format, Part 2: A Making Numbers Meaningful Systematic Review
Benda, Natalie C; Zikmund-Fisher, Brian J; Sharma, Mohit M; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Ancker, Jessica S
HIGHLIGHTS:Communicating relative risk differences as opposed to absolute risk differences, using numerator-only instead of part-to-whole graphics, and including anecdotes or information about others' decisions will all increase intentions to engage in a behavior. Relative risks (rather than absolute risk differences) and numerator-only graphics (rather than part-to-whole graphics) will also increase felt and perceived effectiveness. To illustrate probability differences, people tend to prefer bar charts over icon arrays and graphics with labels over those without. Evidence about the impact of presentation format for probability differences on trust was insufficient across all comparisons.
PMCID:11907595
PMID: 40094048
ISSN: 2381-4683
CID: 5813012
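The contrast the review above turns on, relative versus absolute risk differences, comes down to simple arithmetic; a minimal sketch with hypothetical numbers shows why the same effect can sound much larger when framed in relative terms:

```python
# Hypothetical numbers chosen only to illustrate the framing effect.
baseline_risk = 0.02   # 2% risk without the intervention (assumed)
treated_risk = 0.01    # 1% risk with the intervention (assumed)

absolute_risk_difference = baseline_risk - treated_risk              # 0.01 -> "1 fewer case per 100 people"
relative_risk_reduction = absolute_risk_difference / baseline_risk   # 0.50 -> "50% lower risk"

print(f"Absolute risk difference: {absolute_risk_difference:.1%}")   # 1.0%
print(f"Relative risk reduction: {relative_risk_reduction:.0%}")     # 50%
```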
How Time-Trend Tasks Are Affected by Probability Format: A Making Numbers Meaningful Systematic Review
Sharma, Mohit M; Ancker, Jessica S; Benda, Natalie C; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Zikmund-Fisher, Brian J
HIGHLIGHTS:This systematic review found that few studies of probability trend data compared similar formats or used comparable outcome measures. The only strong piece of evidence was that graphing probabilities over longer time periods, such that the distance between curves widens, tends to increase the perceived difference between the curves. Weak evidence suggests that survival curves (versus mortality curves) may make it easier to identify the option with the highest overall survival. Weak evidence also suggests that graphing probabilities over longer (rather than shorter) time periods may increase the ability to distinguish between small survival differences. Evidence was insufficient to determine whether any format influenced behaviors or behavioral intentions.
PMCID:11848886
PMID: 39995781
ISSN: 2381-4683
CID: 5800702
Large Language Model-Based Responses to Patients' In-Basket Messages
Small, William R; Wiesenfeld, Batia; Brandfield-Harvey, Beatrix; Jonassen, Zoe; Mandal, Soumik; Stevens, Elizabeth R; Major, Vincent J; Lostraglio, Erin; Szerencsy, Adam; Jones, Simon; Aphinyanaphongs, Yindalon; Johnson, Stephen B; Nov, Oded; Mann, Devin
IMPORTANCE:Virtual patient-physician communications have increased since 2020 and negatively impacted primary care physician (PCP) well-being. Generative artificial intelligence (GenAI) drafts of patient messages could potentially reduce health care professional (HCP) workload and improve communication quality, but only if the drafts are considered useful. OBJECTIVES:To assess PCPs' perceptions of GenAI drafts and to examine linguistic characteristics associated with equity and perceived empathy. DESIGN, SETTING, AND PARTICIPANTS:This cross-sectional quality improvement study tested the hypothesis that PCPs' ratings of GenAI drafts (created using the electronic health record [EHR] standard prompts) would be equivalent to HCP-generated responses on 3 dimensions. The study was conducted at NYU Langone Health using private patient-HCP communications at 3 internal medicine practices piloting GenAI. EXPOSURES:Randomly assigned patient messages coupled with either an HCP message or the draft GenAI response. MAIN OUTCOMES AND MEASURES:PCPs rated each response's information content quality (eg, relevance) and communication quality (eg, verbosity) on Likert scales and indicated whether they would use the draft or start anew (usable vs unusable). Branching logic further probed for empathy, personalization, and professionalism of responses. Computational linguistics methods assessed content differences in HCP vs GenAI responses, focusing on equity and empathy. RESULTS:A total of 16 PCPs (8 [50.0%] female) reviewed 344 messages (175 GenAI drafted; 169 HCP drafted). Both GenAI and HCP responses were rated favorably. GenAI responses were rated higher for communication style than HCP responses (mean [SD], 3.70 [1.15] vs 3.38 [1.20]; P = .01; U = 12 568.5) but were similar to HCP responses on information content (mean [SD], 3.53 [1.26] vs 3.41 [1.27]; P = .37; U = 13 981.0) and the proportion of usable drafts (mean [SD], 0.69 [0.48] vs 0.65 [0.47]; P = .49; t = -0.6842). Usable GenAI responses were considered more empathetic than usable HCP responses (32 of 86 [37.2%] vs 13 of 79 [16.5%]; difference, 125.5%), possibly attributable to more subjective (mean [SD], 0.54 [0.16] vs 0.31 [0.23]; P < .001; difference, 74.2%) and positive (mean [SD] polarity, 0.21 [0.14] vs 0.13 [0.25]; P = .02; difference, 61.5%) language; they were also longer (mean [SD] word count, 90.5 [32.0] vs 65.4 [62.6]; difference, 38.4%), although this difference was not statistically significant (P = .07), and were more linguistically complex (mean [SD] score, 125.2 [47.8] vs 95.4 [58.8]; P = .002; difference, 31.2%). CONCLUSIONS:In this cross-sectional study of PCP perceptions of an EHR-integrated GenAI chatbot, GenAI responses communicated information better and with more empathy than HCP responses, highlighting the potential of GenAI to enhance patient-HCP communication. However, GenAI drafts were less readable than HCPs', a significant concern for patients with low health or English literacy.
PMCID:11252893
PMID: 39012633
ISSN: 2574-3805
CID: 5686582
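The linguistic measures reported in the study above (word count, subjectivity, polarity) can be approximated with off-the-shelf tools; the sketch below uses TextBlob purely as an assumed stand-in and is not the study's actual analysis pipeline:

```python
# Assumed stand-in for the study's computational linguistics pipeline;
# TextBlob supplies polarity and subjectivity scores out of the box.
from textblob import TextBlob

def linguistic_features(message: str) -> dict:
    blob = TextBlob(message)
    return {
        "word_count": len(message.split()),
        "polarity": blob.sentiment.polarity,          # -1 (negative) to +1 (positive)
        "subjectivity": blob.sentiment.subjectivity,  # 0 (objective) to 1 (subjective)
    }

print(linguistic_features("Thank you for reaching out; I'm glad your symptoms are improving."))
```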
Taxonomies for synthesizing the evidence on communicating numbers in health: Goals, format, and structure
Ancker, Jessica S; Benda, Natalie C; Sharma, Mohit M; Johnson, Stephen B; Weiner, Stephanie; Zikmund-Fisher, Brian J
Many people, especially those with low numeracy, are known to have difficulty interpreting and applying quantitative information to health decisions. These difficulties have resulted in a rich body of research about better ways to communicate numbers. Synthesizing this body of research into evidence-based guidance, however, is complicated by inconsistencies in research terminology and researcher goals. In this article, we introduce three taxonomies intended to systematize terminology in the literature, derived from an ongoing systematic literature review. The first taxonomy provides a systematic nomenclature for the outcome measures assessed in the studies, including perceptions, decisions, and actions. The second taxonomy is a nomenclature for the data formats assessed, including numbers (and different formats for numbers) and graphics. The third taxonomy describes the quantitative concepts being conveyed, from the simplest (a single value at a single point in time) to more complex ones (including a risk-benefit trade-off and a trend over time). Finally, we demonstrate how these three taxonomies can be used to resolve ambiguities and apparent contradictions in the literature.
PMID: 35007354
ISSN: 1539-6924
CID: 5118462
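One way to picture how the three taxonomies combine is as three independent tags attached to each study comparison; the encoding below is a hypothetical illustration, not a structure proposed by the authors:

```python
# Hypothetical encoding of one study comparison along the three taxonomies
# described in the abstract: outcome measured, data format, quantitative concept.
from dataclasses import dataclass

@dataclass
class StudyComparison:
    outcome: str               # e.g., a perception, decision, or action
    data_format: str           # e.g., frequency, percentage, icon array
    quantitative_concept: str  # e.g., single value, difference, trend over time

example = StudyComparison(
    outcome="behavioral intention",
    data_format="icon array",
    quantitative_concept="difference between two probabilities",
)
print(example)
```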
Identifying Patients With Hypoglycemia Using Natural Language Processing: Systematic Literature Review
Zheng, Yaguang; Dickson, Victoria Vaughan; Blecker, Saul; Ng, Jason M; Rice, Brynne Campbell; Melkus, Gail D'Eramo; Shenkar, Liat; Mortejo, Marie Claire R; Johnson, Stephen B
BACKGROUND:Accurately identifying patients with hypoglycemia is key to preventing adverse events and mortality. Natural language processing (NLP), a form of artificial intelligence, uses computational algorithms to extract information from text data. NLP is a scalable, efficient, and quick method to extract hypoglycemia-related information from electronic health record data sources covering a large population. OBJECTIVE:The objective of this systematic review was to synthesize the literature on the application of NLP to extract hypoglycemia from electronic health record clinical notes. METHODS:Literature searches were conducted electronically in PubMed, Web of Science Core Collection, CINAHL (EBSCO), PsycINFO (Ovid), IEEE Xplore, Google Scholar, and ACL Anthology. Keywords included hypoglycemia, low blood glucose, NLP, and machine learning. Inclusion criteria included studies that applied NLP to identify hypoglycemia, reported outcomes related to hypoglycemia, and were published in English as full papers. RESULTS:This review (n=8 studies) revealed heterogeneity in the reported results related to hypoglycemia. Of the 8 included studies, 4 (50%) reported prevalence rates of any level of hypoglycemia ranging from 3.4% to 46.2%. Using NLP to analyze clinical notes improved the capture of hypoglycemic events that were undocumented or missed by International Classification of Diseases, Ninth Revision (ICD-9) and Tenth Revision (ICD-10) codes and laboratory testing. Combining NLP with ICD-9 or ICD-10 codes significantly increased the identification of hypoglycemic events compared with either method alone; for example, the prevalence rates of hypoglycemia were 12.4% for International Classification of Diseases codes, 25.1% for an NLP algorithm, and 32.2% for the combined algorithms. All the reviewed studies applied rule-based NLP algorithms to identify hypoglycemia. CONCLUSIONS:The findings provided evidence that applying NLP to analyze clinical notes improved the capture of hypoglycemic events, particularly when combined with ICD-9 or ICD-10 codes and laboratory testing.
PMCID:9152713
PMID: 35576579
ISSN: 2371-4379
CID: 5284202
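All of the reviewed studies used rule-based NLP, often combined with diagnosis codes; the sketch below, with assumed keyword rules and illustrative ICD codes, shows how a simple note scan and coded data can be combined so that events missed by one source are caught by the other:

```python
import re

# Keyword rules and ICD codes here are illustrative; the reviewed systems use
# more elaborate rules (negation handling, glucose thresholds, and so on).
HYPOGLYCEMIA_PATTERN = re.compile(r"\b(hypoglycemi[ac]|low blood (sugar|glucose))\b", re.IGNORECASE)
HYPOGLYCEMIA_CODES = {"E16.2", "E16.1", "251.2"}  # example ICD-10 and ICD-9 codes

def flag_hypoglycemia(note_text: str, coded_diagnoses: set) -> bool:
    """Flag an encounter if either the note text or the coded data suggests hypoglycemia."""
    note_hit = bool(HYPOGLYCEMIA_PATTERN.search(note_text))
    code_hit = bool(coded_diagnoses & HYPOGLYCEMIA_CODES)
    return note_hit or code_hit

# An event documented only in free text is still captured.
print(flag_hypoglycemia("Patient reported shakiness and low blood sugar overnight.", {"I10"}))  # True
```

Real systems typically also check for negation ("denies hypoglycemia") and low laboratory glucose values before counting an event.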
Assessing adverse event reports of hysteroscopic sterilization device removal using natural language processing
Mao, Jialin; Sedrakyan, Art; Sun, Tianyi; Guiahi, Maryam; Chudnoff, Scott; Kinard, Madris; Johnson, Stephen B
OBJECTIVE:To develop an annotation model to apply natural language processing (NLP) to device adverse event reports and implement the model to evaluate the most frequently experienced events among women reporting a sterilization device removal. METHODS:Annotation performance was evaluated using the F1 score (a combined measure of PPV and sensitivity). Using extracted variables, we summarized the reporting source, the presence of prespecified and other patient and device events, additional sterilizations and other procedures performed, and time from implantation to removal. RESULTS:The F1 score was 91.5% for labeled items and 93.9% for distinct events after excluding duplicates. A total of 16 535 reports of device removal were analyzed. The most frequently reported patient and device events were abdominal/pelvic/genital pain (N = 13 166, 79.6%) and device dislocation/migration (N = 3180, 19.2%), respectively. Of those reporting an additional sterilization procedure, the majority had a hysterectomy or salpingectomy (N = 7932). One-fifth of the cases with device removal timing specified reported removal more than 7 years after implantation (N = 2444/11 293). CONCLUSIONS:We present a roadmap for developing an annotation model for NLP analysis of device adverse event reports. The extracted information is informative and complements findings from previous research using administrative data.
PMID: 34919294
ISSN: 1099-1557
CID: 5109882
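The F1 score reported above is the harmonic mean of PPV (precision) and sensitivity (recall); the snippet below shows the arithmetic with hypothetical component values, since the paper reports only the combined figures:

```python
def f1_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)

# Component values below are hypothetical; the paper reports only the combined F1.
print(round(f1_score(0.92, 0.91), 3))  # 0.915
```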
An architecture for research computing in health to support clinical and translational investigators with electronic patient data
Campion, Thomas R; Sholle, Evan T; Pathak, Jyotishman; Johnson, Stephen B; Leonard, John P; Cole, Curtis L
OBJECTIVE:Obtaining electronic patient data, especially from electronic health record (EHR) systems, for clinical and translational research is difficult. Multiple research informatics systems exist, but navigating the numerous applications can be challenging for scientists. This article describes the Architecture for Research Computing in Health (ARCH), our institution's approach for matching investigators with tools and services for obtaining electronic patient data. MATERIALS AND METHODS:Supporting the spectrum of studies from populations to individuals, ARCH delivers a breadth of scientific functions (including but not limited to cohort discovery, electronic data capture, and multi-institutional data sharing) that manifest in specific systems such as i2b2, REDCap, and PCORnet. Through a consultative process, ARCH staff align investigators with tools with respect to study design, data sources, and cost. Although most ARCH services are available free of charge, advanced engagements are provided on a fee-for-service basis. RESULTS:Since 2016, ARCH has supported over 1200 unique investigators at Weill Cornell Medicine through more than 4177 consultations. Notably, ARCH infrastructure enabled critical coronavirus disease 2019 response activities for research and patient care. DISCUSSION:ARCH has provided a technical, regulatory, financial, and educational framework to support the biomedical research enterprise with electronic patient data. Collaboration among informaticians, biostatisticians, and clinicians has been critical to rapid generation and analysis of EHR data. CONCLUSION:As a suite of tools and services, ARCH helps match investigators with informatics systems to reduce time to science. ARCH has facilitated research at Weill Cornell Medicine and may provide a model for informatics and research leaders supporting scientists elsewhere.
PMID: 34850911
ISSN: 1527-974x
CID: 5065692
Identifying Patients with Hypoglycemia Using Natural Language Processing: A Systematic Literature Review [Meeting Abstract]
Zheng, Yaguang; Dickson, Victoria Vaughan; Blecker, Saul; Ng, Jason M.; Rice, Brynne Campbell; Shenkar, Liat; Mortejo, Marie Claire R.; Johnson, Stephen B.
ISI:000797631400085
ISSN: 0029-6562
CID: 5246702
The national landscape of culminating experiences in master's programs in health and biomedical informatics
Cox, Suzanne Morrison; Johnson, Stephen B; Shiu, Eva; Boren, Sue
Health and biomedical informatics graduate-level degree programs have proliferated across the United States in the last 10 years. To help inform programs on practices in teaching and learning, a survey of master's programs in health and biomedical informatics in the United States was conducted to determine the national landscape of culminating experiences, including capstone projects, research theses, internships, and practicums. Almost all respondents (97%) reported that their programs required a culminating experience. A paper (not a formal thesis), an oral presentation, a formal course, and an internship were each required by ≥50% of programs. The most commonly reported purposes for the culminating experience were to help students extend and apply their learning and to serve as a bridge to the workplace. The biggest challenges were students' maturity, their difficulty in synthesizing information into a coherent paper, and their ability to generate research ideas. The results provide students and program leaders with a summary of pedagogical methods across programs.
PMCID:7973438
PMID: 33596593
ISSN: 1527-974x
CID: 4861922
ReCiter: An open source, identity-driven, authorship prediction algorithm optimized for academic institutions
Albert, Paul J; Dutta, Sarbajit; Lin, Jie; Zhu, Zimeng; Bales, Michael; Johnson, Stephen B; Mansour, Mohammad; Wright, Drew; Wheeler, Terrie R; Cole, Curtis L
Academic institutions need to maintain publication lists for thousands of faculty and other scholars. Automated tools are essential to minimize the need for direct feedback from the scholars themselves, who are practically unable to commit the effort needed to keep the data accurate. In relying exclusively on clustering techniques, author disambiguation applications fail to satisfy key use cases of academic institutions. Algorithms can perfectly group together a set of publications authored by a common individual, but, for them to be useful to an academic institution, they need to programmatically and recurrently map articles to thousands of scholars of interest en masse. Consistent with a savvy librarian's approach to generating a scholar's list of publications, identity-driven authorship prediction is the process of using information about a scholar to quantify the likelihood that person wrote certain articles. ReCiter is an application that attempts to do exactly that. ReCiter uses institutionally maintained identity data, such as name of department and year of terminal degree, to predict which articles a given scholar has authored. To compute the overall score for a given candidate article from PubMed (and, optionally, Scopus), ReCiter uses: up to 12 types of commonly available identity data; whether other members of a cluster have been accepted or rejected by a user; and the average score of a cluster. In addition, ReCiter provides scoring and qualitative evidence supporting why particular articles are suggested. This context and confidence scoring allows curators to more accurately provide feedback on behalf of scholars. To help users curate publication lists more efficiently, we used a support vector machine analysis to optimize the scoring of the ReCiter algorithm. In our analysis of a diverse test group of 500 scholars at an academic private medical center, ReCiter correctly predicted 98% of their publications in PubMed.
PMCID:8016248
PMID: 33793563
ISSN: 1932-6203
CID: 4862332
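ReCiter itself is open source; the fragment below is only an illustrative simplification of the identity-driven scoring idea described above, with hypothetical evidence types, weights, and example records rather than ReCiter's actual scoring logic:

```python
# Illustrative simplification only: the evidence types, weights, and example
# records are hypothetical and do not reflect ReCiter's actual implementation.
def identity_score(article: dict, scholar: dict) -> float:
    score = 0.0
    if scholar["last_name"].lower() in (name.lower() for name in article["author_last_names"]):
        score += 2.0   # candidate author name matches
    if scholar["department"].lower() in article["affiliation_text"].lower():
        score += 1.5   # scholar's department appears in the affiliation string
    if article["year"] >= scholar["terminal_degree_year"]:
        score += 0.5   # article published after the terminal degree year
    return score

article = {
    "author_last_names": ["Johnson", "Cole"],
    "affiliation_text": "Department of Population Health Sciences, Weill Cornell Medicine",
    "year": 2021,
}
scholar = {"last_name": "Johnson", "department": "Population Health Sciences", "terminal_degree_year": 1990}
print(identity_score(article, scholar))  # 4.0
```

In ReCiter itself, many more evidence types contribute to the score, and the weighting was tuned with a support vector machine analysis, as the abstract describes.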