Searched for:

in-biosketch:yes

person:sbj2002

Total Results:

127


The sublanguage of cross-coverage

Stetson, Peter D; Johnson, Stephen B; Scotch, Matthew; Hripcsak, George
At Columbia-Presbyterian Medical Center, free-text "Signout" notes are typed into the electronic record by clinicians for the purpose of cross-coverage. We plan to "unlock" information about adverse events contained in these notes in a subsequent project using Natural Language Processing (NLP). To better understand the requirements for parsing, Signout notes were compared to other common medical notes (ambulatory clinic notes and discharge summaries) on a series of quantitative metrics. They are shorter (mean length 59.25 words vs. 144.11 and 340.85 for ambulatory and discharge notes respectively) and use more abbreviations (26.88% vs. 20.07% and 3.57%). Despite being terser, Signout notes use fewer ambiguous abbreviations (8.34% vs. 9.09% and 18.02%). Differences were found using Relative Entropy and Squared Chi-square Distance in a novel fashion to compare these medical corpora. Signout notes appear to constitute a unique sublanguage of medicine. The implications for parsing free-text cross-coverage notes into coded medical data are discussed.
PMCID: 2244148
PMID: 12463923
ISSN: 1531-605X
CID: 3585892
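
The corpus comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas or preprocessing, so everything here is an assumption: unigram word distributions with add-one smoothing, relative entropy in bits, and the common form of the chi-square distance, sum of (p - q)^2 / (p + q). The function names and the two toy token lists are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def unigram_distribution(tokens, vocabulary, alpha=1.0):
    """Smoothed word probabilities over a shared vocabulary (assumed preprocessing)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocabulary)
    return {word: (counts[word] + alpha) / total for word in vocabulary}


def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits; asymmetric and zero when p equals q."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p)


def chi_square_distance(p, q):
    """One common chi-square distance between distributions; symmetric."""
    return sum((p[w] - q[w]) ** 2 / (p[w] + q[w]) for w in p)


# Toy stand-ins for a terse cross-coverage note and a discharge summary.
signout = "pt s/p cabg f/u cxr in am".split()
discharge = "the patient was admitted for chest pain and the patient was discharged".split()

vocab = sorted(set(signout) | set(discharge))
p = unigram_distribution(signout, vocab)
q = unigram_distribution(discharge, vocab)

print("D(signout || discharge):", relative_entropy(p, q))
print("chi-square distance:", chi_square_distance(p, q))
```

Smoothing is used so that words absent from one corpus do not produce a division by zero in the relative entropy; a real comparison would need far larger samples than these toy notes.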

The cognitive demands of an innovative query user interface

Wang, Di; Kaufman, David R; Mendonca, Eneida A; Seol, Yoon-Hu; Johnson, Stephen B; Cimino, James J
Too often, online searches for health information are time consuming and produce results that are not sufficiently precise to answer clinicians' or patients' questions. The PERSIVAL project is designed to circumvent this problem by personalizing and tailoring searches and presentation to the demands of the user and the particular clinical context. This paper focuses on a cognitive evaluation of one component of this project, a Query User Interface (QUI). The study examines the system's ability to allow users to easily and intuitively express their information needs. We performed several analyses including a cognitive walkthrough of the interface and quantitative estimations of cognitive load. The paper also presents a preliminary analysis of usability testing. The analyses suggest that there are features in the QUI that contribute to a greater cognitive load and result in greater effort on the part of the subject. The results of usability testing are consistent with these findings. However, subjects found it to be relatively easy and intuitive to generate well-formed queries using the interface. This study contributed to the iterative design of the interface and to the next generation of the PERSIVAL system.
PMCID: 2244191
PMID: 12463945
ISSN: 1531-605X
CID: 3585902

Medical Informatics Training and Research at Columbia University

Shortliffe, E H; Johnson, S B
PMID: 27706367
ISSN: 2364-0502
CID: 3650902

Accessing heterogeneous sources of evidence to answer clinical questions

Mendonça, E A; Cimino, J J; Johnson, S B; Seol, Y H
The large and rapidly growing number of information sources relevant to health care, and the increasing amounts of new evidence produced by researchers, are improving the access of professionals and students to valuable information. However, seeking and filtering useful, valid information can still be very difficult. An online information system that conducts searches based on individual patient data can have a beneficial influence on the particular patient's outcome and educate the healthcare worker. In this paper, we describe the underlying model for a system that aims to facilitate the search for evidence based on clinicians' needs. This paper reviews studies of information needs of clinicians, describes principles of information retrieval, and examines the role that standardized terminologies can play in the integration between a clinical system and literature resources, as well as in the information retrieval process. The paper also describes a model for a digital library system that supports the integration of clinical systems with online information sources, making use of information available in the electronic medical record to enhance searches and information retrieval. The model builds on several different, previously developed techniques to identify information themes that are relevant to specific clinical data. Using a framework of evidence-based practice, the system generates well-structured questions with the intent of enhancing information retrieval. We believe that by helping clinicians to pose well-structured clinical queries and including in them relevant information from individual patients' medical records, we can enhance information retrieval and thus improve patient care.
PMID: 11515415
ISSN: 1532-0464
CID: 3650672

Comparing syntactic complexity in medical and non-medical corpora

Campbell, D A; Johnson, S B
With the growing use of Natural Language Processing (NLP) techniques as solutions in Medical Informatics, the need to quickly and efficiently create the knowledge structures used by these systems has grown concurrently. Automatic discovery of a lexicon for use by an NLP system through machine learning will require information about the syntax of medical language. Understanding the syntactic differences between medical and non-medical corpora may allow more efficient acquisition of a lexicon. Three experiments designed to quantify the syntactic differences in medical and non-medical corpora were conducted. The results show that the syntax of medical language shows less variation than non-medical language and is likely simpler. The differences were great enough to question the applicability of general language tools to medical language. These differences may reduce the difficulty of some free text machine learning problems by capitalizing on the simpler nature of narrative medical syntax.
PMCID: 2243419
PMID: 11825160
ISSN: 1531-605X
CID: 3650682

Using narrative reports to support a digital library

Mendonça, E A; Cimino, J J; Johnson, S B
The vast amount of information collected and stored in clinical systems can be a significant challenge in the integration of digital libraries and electronic medical records, especially the selection of clinical data to be used in the search, retrieval, and summarization processes. In this study, we describe the use of information retrieval measures with natural language processor output to identify critical information in narrative reports. Our hypothesis is that clinical data that occur often in narrative reports are less important to clinicians than findings that occur rarely. We used the information retrieval methods to analyze one year of discharge summaries. We then conducted a performance study, using physicians as subjects. Results show that the methods can be used for filtering critical information from reports. Further studies are needed to evaluate the method on the basis of system performance in the context of a digital library.
PMCID: 2243377
PMID: 11825230
ISSN: 1531-605X
CID: 3650692
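
The hypothesis in this abstract, that findings occurring rarely across reports matter more than findings occurring often, can be illustrated with a short sketch. The abstract does not name the information retrieval measures that were used; inverse document frequency (IDF) is swapped in here as one standard measure with that property. The function name and the toy reports are hypothetical, and each report is assumed to be already reduced to a list of coded findings (for example, by a natural language processor).

```python
import math


def inverse_document_frequency(reports):
    """Map each finding to log(N / df), where df is the number of reports containing it."""
    n_reports = len(reports)
    doc_freq = {}
    for findings in reports:
        for finding in set(findings):  # count each finding once per report
            doc_freq[finding] = doc_freq.get(finding, 0) + 1
    return {finding: math.log(n_reports / df) for finding, df in doc_freq.items()}


# Toy reports: invented findings, not data from the study.
reports = [
    ["hypertension", "chest pain"],
    ["hypertension", "pneumothorax"],
    ["hypertension", "chest pain", "fever"],
    ["hypertension"],
]

idf = inverse_document_frequency(reports)

# Rank findings from rarest (highest weight) to most common.
for finding, weight in sorted(idf.items(), key=lambda item: -item[1]):
    print(f"{finding}: {weight:.3f}")
```

A finding present in every report receives weight zero under this measure, which matches the intuition that ubiquitous findings carry little information for filtering.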

Where are they now? CPR leaders assess their progress. Interview by Anne Zender [Interview]

Johnson, S B; Haug, P; Curtis, C; Defa, T; Davoren, B; Kolodner, R; Monroe, B
PMID: 11186620
ISSN: 1060-5487
CID: 3650662

An object-oriented taxonomy of medical data presentations

Starren, J; Johnson, S B
A variety of methods have been proposed for presenting medical data visually on computers. Discussion of and comparison among these methods have been hindered by a lack of consistent terminology. A taxonomy of medical data presentations based on object-oriented user interface principles is presented. Presentations are divided into five major classes: list, table, graph, icon, and generated text. These are subdivided into eight subclasses with simple inheritance and four subclasses with multiple inheritance. The various subclasses are reviewed and examples are provided. Issues critical to the development and evaluation of presentations are also discussed.
PMCID: 61451
PMID: 10641959
ISSN: 1067-5027
CID: 3650652

Multicenter patient records research: security policies and tools

Behlen, F M; Johnson, S B
The expanding health information infrastructure offers the promise of new medical knowledge drawn from patient records. Such promise will never be fulfilled, however, unless researchers first address policy issues regarding the rights and interests of both the patients and the institutions who hold their records. In this article, the authors analyze the interests of patients and institutions in light of public policy and institutional needs. They conclude that the multicenter study, with Institutional Review Board approval of each study at each site, protects the interests of both. "Anonymity" is no panacea, since patient records are so rich in information that they can never be truly anonymous. Researchers must earn and respect the trust of the public, as responsible stewards of facts about patients' lives. The authors find that computer security tools are needed to administer multicenter patient records studies and describe simple approaches that can be implemented using commercial database products.
PMCID: 61386
PMID: 10579601
ISSN: 1067-5027
CID: 3650642

A semantic lexicon for medical language processing

Johnson, S B
OBJECTIVE: Construction of a resource that provides semantic information about words and phrases to facilitate the computer processing of medical narrative. DESIGN/METHODS: Lexemes (words and word phrases) in the Specialist Lexicon were matched against strings in the 1997 Metathesaurus of the Unified Medical Language System (UMLS) developed by the National Library of Medicine. This yielded a "semantic lexicon," in which each lexeme is associated with one or more syntactic types, each of which can have one or more semantic types. The semantic lexicon was then used to assign semantic types to lexemes occurring in a corpus of discharge summaries (603,306 sentences). Lexical items with multiple semantic types were examined to determine whether some of the types could be eliminated, on the basis of usage in discharge summaries. A concordance program was used to find contrasting contexts for each lexeme that would reflect different semantic senses. Based on this evidence, semantic preference rules were developed to reduce the number of lexemes with multiple semantic types. RESULTS: Matching the Specialist Lexicon against the Metathesaurus produced a semantic lexicon with 75,711 lexical forms, 22,805 (30.1 percent) of which had two or more semantic types. Matching the Specialist Lexicon against one year's worth of discharge summaries identified 27,633 distinct lexical forms, 13,322 of which had at least one semantic type. This suggests that the Specialist Lexicon has about 79 percent coverage for syntactic information and 38 percent coverage for semantic information for discharge summaries. Of those lexemes in the corpus that had semantic types, 3,474 (12.6 percent) had two or more types. When semantic preference rules were applied to the semantic lexicon, the number of entries with multiple semantic types was reduced to 423 (1.5 percent). In the discharge summaries, occurrences of lexemes with multiple semantic types were reduced from 9.41 to 1.46 percent. CONCLUSIONS: Automatic methods can be used to construct a semantic lexicon from existing UMLS sources. This semantic information can aid natural language processing programs that analyze medical narrative, provided that lexemes with multiple semantic types are kept to a minimum. Semantic preference rules can be used to select semantic types that are appropriate to clinical reports. Further work is needed to increase the coverage of the semantic lexicon and to exploit contextual information when selecting semantic senses.
PMCID: 61361
PMID: 10332654
ISSN: 1067-5027
CID: 3650582
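
The idea of semantic preference rules in this abstract can be sketched in a few lines. The abstract does not specify how the rules were represented, so the rule form below (a pair meaning "when a lexeme has both types, keep the first and drop the second") is an assumption. The function names, the toy lexicon, and its type assignments are hypothetical illustrations and are not taken from the UMLS or the paper.

```python
def apply_preference_rules(lexicon, rules):
    """Return a copy of the lexicon with dispreferred semantic types removed.

    lexicon: dict mapping lexeme -> set of semantic types
    rules:   list of (preferred_type, dispreferred_type) pairs (assumed rule form)
    """
    resolved = {}
    for lexeme, types in lexicon.items():
        kept = set(types)
        for preferred, dispreferred in rules:
            if preferred in kept and dispreferred in kept:
                kept.discard(dispreferred)
        resolved[lexeme] = kept
    return resolved


def ambiguity_rate(lexicon):
    """Fraction of lexemes that carry two or more semantic types."""
    ambiguous = sum(1 for types in lexicon.values() if len(types) > 1)
    return ambiguous / len(lexicon)


# Toy semantic lexicon with illustrative type assignments.
lexicon = {
    "discharge": {"Body Substance", "Health Care Activity"},
    "cold": {"Disease or Syndrome", "Natural Phenomenon or Process"},
    "fever": {"Sign or Symptom"},
    "aspirin": {"Pharmacologic Substance"},
}

# Hypothetical rule: in this toy setting, prefer the activity sense of a lexeme.
rules = [("Health Care Activity", "Body Substance")]

resolved = apply_preference_rules(lexicon, rules)
print("ambiguous before:", ambiguity_rate(lexicon))
print("ambiguous after:", ambiguity_rate(resolved))
```

Lexemes not covered by any rule, such as "cold" here, stay ambiguous, which is consistent with the abstract's note that contextual information is still needed when selecting senses.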