Natural Language Processing for Automated Extraction of Continuous Glucose Monitoring Data
Zheng, Yaguang; Song, Yulin; Iturrate, Eduardo; Wu, Bei; Zweig, Susan; Johnson, Stephen B
OBJECTIVE: Continuous glucose monitoring (CGM) is essential in diabetes care and research; however, extracting key data (e.g., time above, in, or below range) from CGM reports is manual, time-consuming, and inefficient. Natural language processing (NLP) can extract data from unstructured sources (e.g., images), but its application in CGM remains unexplored. We aimed to evaluate the accuracy of extracting CGM data using NLP. RESEARCH DESIGN AND METHODS: We analyzed CGM reports stored as PDF files from the electronic health record at New York University Langone Health. The steps of our algorithm pipeline consist of 1) performing optical character recognition (OCR) to obtain glucose matrix data from CGM reports, 2) determining the type of CGM documents based on keywords in OCR results, 3) extracting variables of glucose based on CGM document type, and 4) storing the extracted glucose data in a structured database. Two experts with experience in CGM research and clinical practice conducted an independent manual review of 1% of the documents (n = 226). We calculated accuracy (correct extraction of CGM data) by comparing the algorithm's results with the manual review. RESULTS: Of the documents analyzed, 36.8% were FreeStyle Libre and 63.2% were Dexcom. For information extraction, the agreement in evaluating Libre results between two experts was 99.93%. When comparing algorithm accuracy with manual review, the accuracy for Libre was 99.87% and, for Dexcom, 100.00%. CONCLUSIONS: Using an NLP approach to extract valuable glucose data from CGM PDF files is feasible and accurate, which can benefit clinical practice and diabetes research.
PMID: 41166562
ISSN: 1935-5548
CID: 5961562
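Illustrative sketch for the entry above (Zheng et al., "Natural Language Processing for Automated Extraction of Continuous Glucose Monitoring Data"): steps 2 to 4 of the described pipeline could look roughly like the following, assuming the OCR step has already produced plain text. The keyword lists, label patterns, table name, and function names are all hypothetical; the abstract does not publish the authors' actual keywords or schema, so this is not their implementation.

```python
import re
import sqlite3


def _pct(label):
    # Hypothetical pattern: a label, any non-digit filler, then a percentage.
    return re.compile(label + r"\D*?(\d{1,3})\s*%", re.IGNORECASE)


# Assumed keywords and labels; the real ones are not given in the abstract.
DEVICE_KEYWORDS = {
    "libre": ("freestyle libre", "libreview"),
    "dexcom": ("dexcom",),
}
PATTERNS_BY_DEVICE = {
    "libre": {
        "time_above_pct": _pct("time above range"),
        "time_in_pct": _pct("time in range"),
        "time_below_pct": _pct("time below range"),
    },
    "dexcom": {
        "time_above_pct": _pct("high"),
        "time_in_pct": _pct("in range"),
        "time_below_pct": _pct("low"),
    },
}


def classify_document(ocr_text):
    """Step 2: pick the device type from keywords found in the OCR text."""
    text = ocr_text.lower()
    for device, keywords in DEVICE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return device
    return None  # no known device keyword found


def extract_variables(ocr_text, device):
    """Step 3: apply the device-specific patterns; missing values become None."""
    values = {}
    for name, pattern in PATTERNS_BY_DEVICE[device].items():
        match = pattern.search(ocr_text)
        values[name] = int(match.group(1)) if match else None
    return values


def store_record(conn, doc_id, device, values):
    """Step 4: write one row per document into a structured table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cgm_values ("
        "doc_id TEXT PRIMARY KEY, device TEXT, "
        "time_above_pct INTEGER, time_in_pct INTEGER, time_below_pct INTEGER)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO cgm_values VALUES (?, ?, ?, ?, ?)",
        (doc_id, device, values["time_above_pct"],
         values["time_in_pct"], values["time_below_pct"]),
    )
    conn.commit()
```

Keying the patterns by device mirrors the abstract's point that extraction depends on document type. A production system would need far more tolerant patterns than these, since OCR output is noisy and report layouts vary.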
Evaluating Hospital Course Summarization by an Electronic Health Record-Based Large Language Model
Small, William R; Austrian, Jonathan; O'Donnell, Luke; Burk-Rafel, Jesse; Hochman, Katherine A; Goodman, Adam; Zaretsky, Jonah; Martin, Jacob; Johnson, Stephen; Major, Vincent J; Jones, Simon; Henke, Christian; Verplanke, Benjamin; Osso, Jwan; Larson, Ian; Saxena, Archana; Mednick, Aron; Simonis, Choumika; Han, Joseph; Kesari, Ravi; Wu, Xinyuan; Heery, Lauren; Desel, Tenzin; Baskharoun, Samuel; Figman, Noah; Farooq, Umar; Shah, Kunal; Jahan, Nusrat; Kim, Jeong Min; Testa, Paul; Feldman, Jonah
IMPORTANCE: Hospital course (HC) summarization represents an increasingly onerous discharge summary component for physicians. Literature supports large language models (LLMs) for HC summarization, but whether physicians can effectively partner with electronic health record-embedded LLMs to draft HCs is unknown. OBJECTIVES: To compare the editing effort required by time-constrained resident physicians to improve LLM- vs physician-generated HCs toward a novel 4Cs (complete, concise, cohesive, and confabulation-free) HC. DESIGN, SETTING, AND PARTICIPANTS: Quality improvement study using a convenience sample of 10 internal medicine resident editors, 8 hospitalist evaluators, and randomly selected general medicine admissions in December 2023 lasting 4 to 8 days at New York University Langone Health. EXPOSURES: Residents and hospitalists reviewed randomly assigned patient medical records for 10 minutes. Residents blinded to author type edited each HC pair (physician and LLM) for quality in 3 minutes, followed by comparative ratings by attending hospitalists. MAIN OUTCOMES AND MEASURES: Editing effort was quantified by analyzing the edits that occurred on the HC pairs after controlling for length (percentage edited) and the degree to which the original HCs' meaning was altered (semantic change). Hospitalists compared edited HC pairs with A/B testing on the 4Cs (5-point Likert scales converted to 10-point bidirectional scales). RESULTS: Among 100 admissions, compared with physician HCs, residents edited a smaller percentage of LLM HCs (LLM mean [SD], 31.5% [16.6%] vs physicians, 44.8% [20.0%]; P < .001). Additionally, LLM HCs required less semantic change (LLM mean [SD], 2.4% [1.6%] vs physicians, 4.9% [3.5%]; P < .001). Attending physicians deemed LLM HCs to be more complete (mean [SD] difference LLM vs physicians on 10-point bidirectional scale, 3.00 [5.28]; P < .001), similarly concise (mean [SD], -1.02 [6.08]; P = .20), and cohesive (mean [SD], 0.70 [6.14]; P = .60), but with more confabulations (mean [SD], -0.98 [3.53]; P = .002). The composite scores were similar (mean [SD] difference LLM vs physician on 40-point bidirectional scale, 1.70 [14.24]; P = .46). CONCLUSIONS AND RELEVANCE: Electronic health record-embedded LLM HCs required less editing than physician-generated HCs to approach a quality standard, resulting in HCs that were comparably or more complete, concise, and cohesive, but contained more confabulations. Despite the potential influence of artificial time constraints, this study supports the feasibility of a physician-LLM partnership for writing HCs and provides a basis for monitoring LLM HCs in clinical practice.
PMID: 40802185
ISSN: 2574-3805
CID: 5906762
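Illustrative sketch for the entry above (Small et al., "Evaluating Hospital Course Summarization by an Electronic Health Record-Based Large Language Model"): a length-normalized "percentage edited" measure could be approximated with a word-level diff. The abstract does not say how the authors computed their measure, so the word-level `difflib` approach, the choice to count only removed or replaced original words, and the function name `percent_edited` are all assumptions, not the study's method.

```python
from difflib import SequenceMatcher


def percent_edited(original, edited):
    """Percentage of the original draft's words that were removed or replaced.

    Dividing by the original word count normalizes for draft length.
    Pure insertions are not counted in this simplified measure.
    """
    original_words = original.split()
    edited_words = edited.split()
    if not original_words:
        return 0.0
    matcher = SequenceMatcher(None, original_words, edited_words, autojunk=False)
    changed = sum(
        i2 - i1
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag in ("replace", "delete")
    )
    return 100.0 * changed / len(original_words)
```

The study's second measure, semantic change, would need a comparison of meaning rather than surface tokens and is not attempted here.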
Classifying Continuous Glucose Monitoring Documents From Electronic Health Records
Zheng, Yaguang; Iturrate, Eduardo; Li, Lehan; Wu, Bei; Small, William R; Zweig, Susan; Fletcher, Jason; Chen, Zhihao; Johnson, Stephen B
BACKGROUND: Clinical use of continuous glucose monitoring (CGM) is increasing storage of CGM-related documents in electronic health records (EHR); however, the standardization of CGM storage is lacking. We aimed to evaluate the sensitivity and specificity of CGM Ambulatory Glucose Profile (AGP) classification criteria. METHODS: We randomly chose 2244 (18.1%) documents from NYU Langone Health. Our document classification algorithm: (1) separated multiple-page documents into single-page images; (2) rotated all pages into an upright orientation; (3) determined types of devices using optical character recognition; and (4) tested for the presence of particular keywords in the text. Two experts in using CGM for research and clinical practice conducted an independent manual review of 62 (2.8%) reports. We calculated sensitivity (correct classification of CGM AGP report) and specificity (correct classification of non-CGM report) by comparing the classification algorithm against manual review. RESULTS: Among 2244 documents, 1040 (46.5%) were classified as CGM AGP reports (43.3% FreeStyle Libre and 56.7% Dexcom), 1170 (52.1%) non-CGM reports (eg, progress notes, CGM request forms, or physician letters), and 34 (1.5%) uncertain documents. The agreement for the evaluation of the documents between the two experts was 100% for sensitivity and 98.4% for specificity. When comparing the classification result between the algorithm and manual review, the sensitivity and specificity were 95.0% and 91.7%. CONCLUSIONS: Nearly half of CGM-related documents were AGP reports, which are useful for clinical practice and diabetes research; however, the remaining half are other clinical documents. Future work needs to standardize the storage of CGM-related documents in the EHR.
PMCID: 11904921
PMID: 40071848
ISSN: 1932-2968
CID: 5808452
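Illustrative sketch for the entry above (Zheng et al., "Classifying Continuous Glucose Monitoring Documents From Electronic Health Records"): sensitivity and specificity, as defined in the abstract, can be computed by comparing the algorithm's labels with the manual-review labels. The function name and the boolean encoding (`True` for a CGM AGP report, `False` for a non-CGM report) are assumptions made for illustration, and the sample data in the usage note is invented, not taken from the study.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) for paired boolean labels.

    predicted: the algorithm's labels, True meaning "CGM AGP report".
    reference: the manual-review labels for the same documents.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p and r)
    false_neg = sum(1 for p, r in pairs if not p and r)
    true_neg = sum(1 for p, r in pairs if not p and not r)
    false_pos = sum(1 for p, r in pairs if p and not r)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("reference must contain both positive and negative cases")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity
```

For example, with invented labels `predicted = [True, True, False, False, True]` and `reference = [True, False, False, False, True]`, both true AGP reports are found (sensitivity 1.0) and two of the three non-CGM documents are correctly rejected (specificity 2/3).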
How Point (Single-Probability) Tasks Are Affected by Probability Format, Part 2: A Making Numbers Meaningful Systematic Review
Ancker, Jessica S; Benda, Natalie C; Sharma, Mohit M; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Zikmund-Fisher, Brian J
HIGHLIGHTS: Formatting a probability as 1 in X, using a foreground-only icon array, adding anecdotes to numbers, and gain-loss framing all affect probability perceptions and feelings. The evidence on communicating numbers to influence perceptions is far stronger than the evidence on using it to change health behavior or behavioral intention. Only weak evidence is available on patient preferences for verbal, graphical, and numerical probability formats.
PMCID: 11848894
PMID: 39995775
ISSN: 2381-4683
CID: 5800662
How Difference Tasks Are Affected by Probability Format, Part 1: A Making Numbers Meaningful Systematic Review
Benda, Natalie C; Zikmund-Fisher, Brian J; Sharma, Mohit M; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Ancker, Jessica S
HIGHLIGHTS: than with 1 in X rates. Adding graphics to probabilities helps readers compute differences between probabilities.
PMCID: 11848882
PMID: 39995776
ISSN: 2381-4683
CID: 5800672
Scope, Methods, and Overview Findings for the Making Numbers Meaningful Evidence Review of Communicating Probabilities in Health: A Systematic Review
Ancker, Jessica S; Benda, Natalie C; Sharma, Mohit M; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Zikmund-Fisher, Brian J
HIGHLIGHTS: The Making Numbers Meaningful project conducted a comprehensive systematic review of experimental and quasi-experimental research that compared 2 or more formats for presenting quantitative health information to patients or other lay audiences. The current article focuses on probability information. Based on a conceptual taxonomy, we reviewed studies based on the cognitive tasks required of participants, assessing 14 distinct possible outcomes. Our review identified 316 articles involving probability communications that generated 1,119 distinct research findings, each of which was reviewed by multiple experts for credibility. The overall pattern of findings highlights which probability communication questions have been well researched and which have not. For example, there has been far more research on communicating single probabilities than on communicating more complex information such as trends over time, and there has been a large amount of research on the effect of communication approaches on behavioral intentions but relatively little on behaviors.
PMCID: 11848889
PMID: 39995784
ISSN: 2381-4683
CID: 5800712
How Difference Tasks Are Affected by Probability Format, Part 2: A Making Numbers Meaningful Systematic Review
Benda, Natalie C; Zikmund-Fisher, Brian J; Sharma, Mohit M; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Ancker, Jessica S
HIGHLIGHTS: Communicating relative risk differences as opposed to absolute risk differences, using numerator-only instead of part-to-whole graphics, and including anecdotes or information about others' decisions will all increase intentions to engage in a behavior. Relative risks (rather than absolute risk differences) and numerator-only graphics (rather than part-to-whole) will also increase felt and perceived effectiveness. To illustrate probability differences, people tend to prefer bar charts over icon arrays and graphics with labels over those without. All findings regarding the impact of different presentation formats for probability differences on trust produced insufficient evidence.
PMCID: 11907595
PMID: 40094048
ISSN: 2381-4683
CID: 5813012
How Point (Single-Probability) Tasks Are Affected by Probability Format, Part 1: A Making Numbers Meaningful Systematic Review
Ancker, Jessica S; Benda, Natalie C; Sharma, Mohit M; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Zikmund-Fisher, Brian J
HIGHLIGHTS: Many researchers have studied the effects of data presentation formats of single probabilities on different outcomes. However, few findings are comparable enough to allow for strong evidence-based conclusions about the impact on identification, recall, contrast, categorization, and computation outcomes.
PMCID: 11848880
PMID: 39995779
ISSN: 2381-4683
CID: 5800692
How Synthesis Tasks Are Affected by Probability Format: A Making Numbers Meaningful Systematic Review
Benda, Natalie C; Sharma, Mohit M; Ancker, Jessica S; Demetres, Michelle; Delgado, Diana; Johnson, Stephen B; Zikmund-Fisher, Brian J
HIGHLIGHTS: This study found a moderate number of studies assessing strategies for evaluating sets of probabilities conveying information such as risks and benefits. Evidence is moderate that although presenting sets of probabilities in table versus sentences may not affect behavioral intentions, people may prefer tables. Contrary to previous studies about probability feelings, moderate evidence suggested that narratives may not affect effectiveness feelings. Evidence was insufficient to draw conclusions regarding contrast, identification, and trust outcomes, and no studies assessed recall, categorization, computation, or discrimination outcomes.
PMCID: 11848887
PMID: 39995777
ISSN: 2381-4683
CID: 5800682
How Time-Trend Tasks Are Affected by Probability Format: A Making Numbers Meaningful Systematic Review
Sharma, Mohit M; Ancker, Jessica S; Benda, Natalie C; Johnson, Stephen B; Demetres, Michelle; Delgado, Diana; Zikmund-Fisher, Brian J
HIGHLIGHTS: This systematic review found that few studies of probability trend data compared similar formats or used comparable outcome measures. The only strong piece of evidence was that graphing probabilities over longer time periods such that the distance between curves widens will tend to increase the perceived difference between the curves. Weak evidence suggests that survival curves (versus mortality curves) may make it easier to identify the option with the highest overall survival. Weak evidence suggests that graphing probabilities over longer (rather than shorter) time periods may increase the ability to distinguish between small survival differences. Evidence was insufficient to determine whether any format influenced behaviors or behavioral intentions.
PMCID: 11848886
PMID: 39995781
ISSN: 2381-4683
CID: 5800702