Searched for:

in-biosketch:yes

person:sbj2002

Total Results:

132


Using narrative reports to support a digital library

Mendonça, E A; Cimino, J J; Johnson, S B
The vast amount of information collected and stored in clinical systems can be a significant challenge in the integration of digital libraries and electronic medical records, especially in the selection of clinical data to be used in the search, retrieval, and summarization processes. In this study, we describe the use of information retrieval measures with natural language processor output to identify critical information in narrative reports. Our hypothesis is that clinical data that occur often in narrative reports are less important to clinicians than findings that occur rarely. We used the information retrieval methods to analyze one year of discharge summaries. We then conducted a performance study, using physicians as subjects. Results show that the methods can be used for filtering critical information from reports. Further studies are needed to evaluate the method in terms of system performance in the context of a digital library.
PMCID: 2243377
PMID: 11825230
ISSN: 1531-605x
CID: 3650692

Where are they now? CPR leaders assess their progress. Interview by Anne Zender [Interview]

Johnson, S B; Haug, P; Curtis, C; Defa, T; Davoren, B; Kolodner, R; Monroe, B
PMID: 11186620
ISSN: 1060-5487
CID: 3650662

An object-oriented taxonomy of medical data presentations

Starren, J; Johnson, S B
A variety of methods have been proposed for presenting medical data visually on computers. Discussion of and comparison among these methods have been hindered by a lack of consistent terminology. A taxonomy of medical data presentations based on object-oriented user interface principles is presented. Presentations are divided into five major classes: list, table, graph, icon, and generated text. These are subdivided into eight subclasses with simple inheritance and four subclasses with multiple inheritance. The various subclasses are reviewed and examples are provided. Issues critical to the development and evaluation of presentations are also discussed.
PMCID: 61451
PMID: 10641959
ISSN: 1067-5027
CID: 3650652

Multicenter patient records research: security policies and tools

Behlen, F M; Johnson, S B
The expanding health information infrastructure offers the promise of new medical knowledge drawn from patient records. Such promise will never be fulfilled, however, unless researchers first address policy issues regarding the rights and interests of both the patients and the institutions who hold their records. In this article, the authors analyze the interests of patients and institutions in light of public policy and institutional needs. They conclude that the multicenter study, with Institutional Review Board approval of each study at each site, protects the interests of both. "Anonymity" is no panacea, since patient records are so rich in information that they can never be truly anonymous. Researchers must earn and respect the trust of the public, as responsible stewards of facts about patients' lives. The authors find that computer security tools are needed to administer multicenter patient records studies and describe simple approaches that can be implemented using commercial database products.
PMCID: 61386
PMID: 10579601
ISSN: 1067-5027
CID: 3650642

A semantic lexicon for medical language processing

Johnson, S B
OBJECTIVE: Construction of a resource that provides semantic information about words and phrases to facilitate the computer processing of medical narrative.
DESIGN/METHODS: Lexemes (words and word phrases) in the Specialist Lexicon were matched against strings in the 1997 Metathesaurus of the Unified Medical Language System (UMLS) developed by the National Library of Medicine. This yielded a "semantic lexicon," in which each lexeme is associated with one or more syntactic types, each of which can have one or more semantic types. The semantic lexicon was then used to assign semantic types to lexemes occurring in a corpus of discharge summaries (603,306 sentences). Lexical items with multiple semantic types were examined to determine whether some of the types could be eliminated, on the basis of usage in discharge summaries. A concordance program was used to find contrasting contexts for each lexeme that would reflect different semantic senses. Based on this evidence, semantic preference rules were developed to reduce the number of lexemes with multiple semantic types.
RESULTS: Matching the Specialist Lexicon against the Metathesaurus produced a semantic lexicon with 75,711 lexical forms, 22,805 (30.1 percent) of which had two or more semantic types. Matching the Specialist Lexicon against one year's worth of discharge summaries identified 27,633 distinct lexical forms, 13,322 of which had at least one semantic type. This suggests that the Specialist Lexicon has about 79 percent coverage for syntactic information and 38 percent coverage for semantic information for discharge summaries. Of those lexemes in the corpus that had semantic types, 3,474 (12.6 percent) had two or more types. When semantic preference rules were applied to the semantic lexicon, the number of entries with multiple semantic types was reduced to 423 (1.5 percent). In the discharge summaries, occurrences of lexemes with multiple semantic types were reduced from 9.41 to 1.46 percent.
CONCLUSIONS: Automatic methods can be used to construct a semantic lexicon from existing UMLS sources. This semantic information can aid natural language processing programs that analyze medical narrative, provided that lexemes with multiple semantic types are kept to a minimum. Semantic preference rules can be used to select semantic types that are appropriate to clinical reports. Further work is needed to increase the coverage of the semantic lexicon and to exploit contextual information when selecting semantic senses.
PMCID: 61361
PMID: 10332654
ISSN: 1067-5027
CID: 3650582

Use of the Extensible Stylesheet Language (XSL) for medical data transformation

Seol, Y H; Johnson, S B; Starren, J
Recently, the Extensible Markup Language (XML) has received growing attention as a simple but flexible mechanism to represent medical data. As XML-based markups become more common, there will be an increasing need to transform data stored in one XML markup into another markup. The Extensible Stylesheet Language (XSL) is a stylesheet language for XML. Development of a new mammography reporting system created a need to convert XML output from the MedLEE natural language processing system into a format suitable for cross-patient reporting. This paper examines the capability of XSL as a rule specification language that supports medical XML data transformation. A set of nine relevant transformations was identified: Filtering, Substitution, Specification, Aggregation, Merging, Splitting, Transposition, Push-down and Pull-up. XSL-based methods for implementing these transformations are presented. The strengths and limitations of XSL are discussed in the context of XML medical data transformation.
PMCID: 2232783
PMID: 10566337
ISSN: 1531-605x
CID: 3650602

A technique for semantic classification of unknown words using UMLS resources

Campbell, D A; Johnson, S B
Natural Language Processing (NLP) is a tool for transforming natural text into codable form. Success of NLP systems is contingent on a well-constructed semantic lexicon. However, creation and maintenance of these lexicons are difficult, costly and time-consuming. The UMLS contains semantic and syntactic information about medical terms, which may be used to automate some of this task. Using UMLS resources we have observed that it is possible to define one semantic type by its syntactic combinations with other types in a corpus of discharge summaries. These patterns of combination can then be used to classify words that are not in the lexicon. The technique was applied to a corpus for a single semantic type and generated a list of 875 words that matched the classification criteria for that type. The words were ranked by number of patterns matched and the top 95 words were correctly typed with 80% accuracy.
PMCID: 2232586
PMID: 10566453
ISSN: 1531-605x
CID: 3650622

Extended SQL for manipulating clinical warehouse data

Johnson, S B; Chatziantoniou, D
Health care institutions are beginning to collect large amounts of clinical data through patient care applications. Clinical data warehouses make these data available for complex analysis across patient records, benefiting administrative reporting, patient care and clinical research. Data gathered for patient care purposes are difficult to manipulate for analytic tasks; the schema presents conceptual difficulties for the analyst, and many queries perform poorly. An extension to SQL is presented that enables the analyst to designate groups of rows. These groups can then be manipulated and aggregated in various ways to solve a number of useful analytic problems. The extended SQL is concise and runs in linear time, while standard SQL requires multiple statements with polynomial performance. The extensions are extremely powerful for performing aggregations on large amounts of data, which is useful in clinical data mining applications.
PMCID: 2232585
PMID: 10566474
ISSN: 1531-605x
CID: 3650632

Security architecture for multi-site patient records research

Behlen, F M; Johnson, S B
A security system was developed as part of a patient records research database project intended for both local and multi-site studies. A comprehensive review of ethical foundations and legal environment was undertaken, and a security system comprising both administrative policies and computer tools was developed. For multi-site studies, Institutional Review Board (IRB) approval is required for each study, at each participating site. A sponsoring Principal Investigator (PI) is required at each site, and each PI needs automated enforcement tools. Systems fitting this model were implemented at two academic medical centers. Security features of commercial database systems were found to be adequate for basic enforcement of approved research protocols.
PMCID: 2232693
PMID: 10566404
ISSN: 1531-605x
CID: 3650612

Conceptual graph grammar--a simple formalism for sublanguage

Johnson, S B
There is a wide variety of computer applications that deal with various aspects of medical language: concept representation, controlled vocabulary, natural language processing, and information retrieval. While technical and theoretical methods appear to differ, all approaches investigate different aspects of the same phenomenon: medical sublanguage. This paper surveys the properties of medical sublanguage from a formal perspective, based on detailed analyses cited in the literature. A review of several computer systems based on sublanguage approaches shows some of the difficulties in addressing the interaction between the syntactic and semantic aspects of sublanguage. A formalism called Conceptual Graph Grammar is presented that attempts to combine both syntax and semantics into a single notation by extending standard Conceptual Graph notation. Examples from the domain of pathology diagnoses are provided to illustrate the use of this formalism in medical language analysis. The strengths and weaknesses of the approach are then considered. Conceptual Graph Grammar is an attempt to synthesize the common properties of different approaches to sublanguage into a single formalism, and to begin to define a common foundation for language-related research in medical informatics.
PMID: 9865032
ISSN: 0026-1270
CID: 3651152