Generic database design for patient management information
Johnson, S B; Paul, T; Khenina, A
Patient management information tracks general facts about the location of the patient and the providers assigned to care for the patient. The Clinical Data Repository at Columbia Presbyterian Medical Center employs a generic schema to record patient management events. The schema is extremely simple, yet can support several different views of patient information, as required by different applications: a longitudinal view of patient visits, including both inpatient and outpatient encounters; a visit-oriented view, to record facts related to a current encounter; a location-based view to provide a census of a nursing ward; and a provider-based view to give a list of the patients currently being cared for by a given clinician. All of these views can be supported in a highly efficient manner by the use of appropriate indexes.
PMCID:2233478
PMID: 9357581
ISSN: 1091-8280
CID: 3651132
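The abstract above describes one generic schema serving four views through suitable indexes. A minimal sketch of that idea follows, using an in-memory SQLite database. The table name `pm_event`, its columns, the index names, and the sample rows are all hypothetical illustrations and are not taken from the Columbia Presbyterian schema.

```python
import sqlite3


def build_demo():
    """Create a hypothetical generic event table with two supporting indexes."""
    conn = sqlite3.connect(":memory:")
    # One generic table: each row is a patient management event (assumed layout).
    conn.execute(
        """CREATE TABLE pm_event (
               patient_id TEXT, visit_id TEXT, event_type TEXT,
               value TEXT, start_time TEXT, end_time TEXT)"""
    )
    # Index for patient-oriented (longitudinal and visit) views.
    conn.execute("CREATE INDEX idx_patient ON pm_event (patient_id, start_time)")
    # Index for location-based and provider-based views.
    conn.execute("CREATE INDEX idx_type_value ON pm_event (event_type, value, end_time)")
    rows = [
        ("p1", "v1", "visit", "inpatient", "2024-01-01", None),
        ("p1", "v1", "location", "ward-7", "2024-01-01", None),
        ("p1", "v1", "provider", "dr-a", "2024-01-01", None),
        ("p2", "v2", "visit", "outpatient", "2024-01-02", "2024-01-02"),
        ("p2", "v2", "provider", "dr-a", "2024-01-02", "2024-01-02"),
    ]
    conn.executemany("INSERT INTO pm_event VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def current_patients(conn, event_type, value):
    """Patients with an open (end_time IS NULL) event of the given type and value."""
    cur = conn.execute(
        "SELECT patient_id FROM pm_event "
        "WHERE event_type = ? AND value = ? AND end_time IS NULL "
        "ORDER BY patient_id",
        (event_type, value),
    )
    return [row[0] for row in cur]


def visit_history(conn, patient_id):
    """Longitudinal view: all visits for one patient, oldest first."""
    cur = conn.execute(
        "SELECT visit_id, value FROM pm_event "
        "WHERE patient_id = ? AND event_type = 'visit' ORDER BY start_time",
        (patient_id,),
    )
    return cur.fetchall()
```

With this layout, a ward census is `current_patients(conn, "location", "ward-7")` and a clinician's patient list is `current_patients(conn, "provider", "dr-a")`; both read the same table and differ only in the index they would use.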
Generic data modeling for clinical repositories
Johnson, S B
OBJECTIVE: To construct a large-scale clinical repository that accurately captures a detailed understanding of the data vital to the process of health care and that provides highly efficient access to patient information for the users of a clinical information system. DESIGN/METHODS: Conventional approaches to data modeling encourage the development of a highly specific data schema in order to capture as much information as possible about a given domain. In contrast, current database technology functions most effectively for clinical databases when a generic data schema is used. The technique of "generic data modeling" is presented as a method of reconciling these opposing views of clinical data, using formal operations to transform a detailed schema into a generic one. RESULTS: A complex schema consisting of hundreds of entities and representing a rich set of constraints about the patient care domain is transformed into a generic schema consisting of roughly two dozen tables. The resulting database design is efficient for patient-oriented queries and is highly flexible in adapting to the changing information needs of a health care institution, particularly changes involving the collection of new data elements. CONCLUSIONS: Conventional approaches to data modeling can be used to develop rich, complex models of clinical data that are useful for understanding and managing the process of patient care. Generic data modeling techniques can successfully transform a detailed design into an efficient generic design that is flexible enough to meet the needs of an operational clinical information system.
PMCID:116317
PMID: 8880680
ISSN: 1067-5027
CID: 3651112
Design of a clinical event monitor
Hripcsak, G; Clayton, P D; Jenders, R A; Cimino, J J; Johnson, S B
The issues and implementation of a clinical event monitor are described. An event monitor generates messages for providers, patients, and organizations based on clinical events and patient data. For example, an order for a medication might trigger the generation of a warning about a drug interaction. A model based on the active database literature has as its main components an event (which triggers a rule to fire), a condition (which tests whether an action ought to be performed), and an action (often the generation of a message). The details of implementing such a monitor are described, using as an example the Columbia-Presbyterian Medical Center clinical event monitor, which is based on the Arden Syntax for Medical Logic Modules.
PMID: 8812070
ISSN: 0010-4809
CID: 3651102
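The event, condition, and action components named in the abstract above can be sketched as follows. This is an illustrative toy only, not the Columbia-Presbyterian monitor and not Arden Syntax; the class names, the `medication_order` event type, and the placeholder drug names `drug_a` and `drug_b` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interaction table; the drug names are placeholders, not clinical facts.
INTERACTIONS = {frozenset({"drug_a", "drug_b"})}


@dataclass
class Rule:
    event_type: str                       # event that triggers the rule
    condition: Callable[[Dict], bool]     # tests whether the action should run
    action: Callable[[Dict], str]         # produces a message


class EventMonitor:
    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def handle(self, event_type: str, data: Dict) -> List[str]:
        """Fire every rule whose event matches and whose condition holds."""
        return [
            rule.action(data)
            for rule in self.rules
            if rule.event_type == event_type and rule.condition(data)
        ]


def _interacting(data: Dict) -> List[str]:
    """Active drugs that interact with the newly ordered drug."""
    return sorted(
        d for d in data["active_drugs"]
        if frozenset({data["drug"], d}) in INTERACTIONS
    )


def build_demo_monitor() -> EventMonitor:
    monitor = EventMonitor()
    monitor.register(
        Rule(
            event_type="medication_order",
            condition=lambda data: bool(_interacting(data)),
            action=lambda data: "Warning: possible interaction between "
            + data["drug"] + " and " + ", ".join(_interacting(data)),
        )
    )
    return monitor
```

Calling `build_demo_monitor().handle("medication_order", {"drug": "drug_b", "active_drugs": ["drug_a"]})` yields a single warning message, while an order with no interacting active drug yields an empty list.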
Integrating data from natural language processing into a clinical information system
Johnson, S B; Friedman, C
Demographic data extracted from discharge summaries by natural language processing was compared to data gathered by a conventional hospital admitting system. Discrepancies in data were noted in names, age, sex, race, and ethnicity. Some differences are attributable to errors in collection: interaction with patient, dictation, transcription, and data entry. Very few differences were due to errors in natural language processing. Other differences can be used to critique existing data, or to enhance data with more detailed information. Discrepancies in data as elementary as patient demographics raise the issue of resolving conflicts when neither source of data is known to be more reliable. Clinical repositories can represent conflicting data from multiple sources, but clinical information systems must bear the cost of increased complexity in the application programs that will use the data.
PMCID:2233157
PMID: 8947724
ISSN: 1091-8280
CID: 3651122
Unlocking clinical data from narrative reports: a study of natural language processing
Hripcsak, G; Friedman, C; Alderson, P O; DuMouchel, W; Johnson, S B; Clayton, P D
OBJECTIVE: To evaluate the automated detection of clinical conditions described in narrative reports. DESIGN/METHODS: Automated methods and human experts detected the presence or absence of six clinical conditions in 200 admission chest radiograph reports. STUDY SUBJECTS/METHODS: A computerized, general-purpose natural language processor; 6 internists; 6 radiologists; 6 lay persons; and 3 other computer methods. MAIN OUTCOME MEASURES/METHODS: Intersubject disagreement was quantified by "distance" (the average number of clinical conditions per report on which two subjects disagreed) and by sensitivity and specificity with respect to the physicians. RESULTS: Using a majority vote, physicians detected 101 conditions in the 200 reports (0.51 per report); the most common condition was acute bacterial pneumonia (prevalence, 0.14), and the least common was chronic obstructive pulmonary disease (prevalence, 0.03). Pairs of physicians disagreed on the presence of at least 1 condition for an average of 20% of reports. The average intersubject distance among physicians was 0.24 (95% CI, 0.19 to 0.29) out of a maximum possible distance of 6. No physician had a significantly greater distance than the average. The average distance of the natural language processor from the physicians was 0.26 (CI, 0.21 to 0.32; not significantly greater than the average among physicians). Lay persons and alternative computer methods had significantly greater distance from the physicians (all > 0.5). The natural language processor had a sensitivity of 81% (CI, 73% to 87%) and a specificity of 98% (CI, 97% to 99%); physicians had an average sensitivity of 85% and an average specificity of 98%. CONCLUSIONS: Physicians disagreed on the interpretation of narrative reports, but this was not caused by outlier physicians or a consistent difference in the way internists and radiologists read reports. The natural language processor was not distinguishable from the physicians and was superior to all other comparison subjects. Although the domain of this study was restricted (six clinical conditions in chest radiographs), natural language processing seems to have the potential to extract clinical information from narrative reports in a manner that will support automated decision-support and clinical research.
PMID: 7702231
ISSN: 0003-4819
CID: 3650972
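The "distance" measure defined in the abstract above (the average number of clinical conditions per report on which two subjects disagree) can be sketched as below, together with sensitivity and specificity against a reference reading. The function names and the boolean-matrix representation are assumptions for illustration; the study's actual procedure, including how the physician reference was formed, is not reproduced here.

```python
from typing import List, Sequence, Tuple

# Each subject's readings: one row per report, one boolean per clinical condition.
Readings = List[Sequence[bool]]


def distance(a: Readings, b: Readings) -> float:
    """Average number of conditions per report on which two subjects disagree."""
    disagreements = sum(
        sum(x != y for x, y in zip(report_a, report_b))
        for report_a, report_b in zip(a, b)
    )
    return disagreements / len(a)


def sensitivity_specificity(pred: Readings, ref: Readings) -> Tuple[float, float]:
    """Sensitivity and specificity of `pred`, treating `ref` as the reference."""
    tp = fp = tn = fn = 0
    for report_pred, report_ref in zip(pred, ref):
        for p, r in zip(report_pred, report_ref):
            if r and p:
                tp += 1
            elif r and not p:
                fn += 1
            elif not r and p:
                fp += 1
            else:
                tn += 1
    # Assumes the reference contains at least one positive and one negative.
    return tp / (tp + fn), tn / (tn + fp)
```

With six conditions per report, the largest value `distance` can return is 6, which matches the maximum possible distance stated in the abstract.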
Managing vocabulary for a centralized clinical system
Cimino, J J; Johnson, S B; Hripcsak, G; Hill, C L; Clayton, P D
The clinical computing environment at Columbia-Presbyterian Medical Center is organized around a centralized database of coded patient information collected from various ancillary sources. The Medical Entities Dictionary (MED) is the central repository for the controlled vocabulary used to encode the patient data. The MED is composed of terms used in the ancillary departments and, as such, changes in the source vocabularies must be maintained in the MED. The MED also contains some basic knowledge about the terms, and sophisticated maintenance tools have been developed that take advantage of this knowledge. This paper describes the success of the knowledge-based approach by describing the techniques used in two tasks: addition of a new vocabulary and maintenance of an existing one.
PMID: 8591133
ISSN: 1569-6332
CID: 3651092
Applying a controlled medical terminology to a distributed, production clinical information system
Forman, B H; Cimino, J J; Johnson, S B; Sengupta, S; Sideli, R; Clayton, P
To maximize the value of computerized medical records systems, an organizing structure is needed. That structure can be provided by a controlled medical terminology (CMT). At Columbia-Presbyterian Medical Center, we have been employing a controlled medical terminology, our Medical Entities Dictionary (MED), to mediate the storage and retrieval of patient data and enable decision support applications. This paper describes how the MED is actually used for data management in our distributed clinical information systems environment. Our system tools which access the MED for production purposes facilitate the mapping of terms from many sources to a uniform representation of concepts and also return information about the relationships between concepts. Applications which access a CMT for production purposes should be optimized for performance in high volume settings, fault tolerant, synchronizable, extensible, portable, and maintainable. We briefly describe our system architecture and then demonstrate how we utilize the MED for translation and semantic information as data is moved into and out of our patient database. We discuss our current tools and present a preview of the next generation of applications which will manage access to the MED for our production systems.
PMCID:2579127
PMID: 8563316
ISSN: 0195-4210
CID: 3651082
A data model that captures clinical reasoning about patient problems
Barrows, R C; Johnson, S B
We describe a data model that has been implemented for the CPMC Ambulatory Care System, and exemplify its function for patient problems. The model captures some nuances of clinical thinking about patients that are not accommodated in most other models, such as an evolution of clinical understanding about patient problems. A record of this understanding has clinical utility, and serves research interests as well as medical audit concerns. The model is described with an example, and advantages and limitations in the current implementation are discussed.
PMCID:2579123
PMID: 8563311
ISSN: 0195-4210
CID: 3651072
Medical decision support: experience with implementing the Arden Syntax at the Columbia-Presbyterian Medical Center
Jenders, R A; Hripcsak, G; Sideli, R V; DuMouchel, W; Zhang, H; Cimino, J J; Johnson, S B; Sherman, E H; Clayton, P D
We began implementation of a medical decision support system (MDSS) at the Columbia-Presbyterian Medical Center (CPMC) using the Arden Syntax in 1992. The Clinical Event Monitor which executes the Medical Logic Modules (MLMs) runs on a mainframe computer. Data are stored in a relational database and accessed via PL/I programs known as Data Access Modules (DAMs). Currently we have 18 clinical, 12 research and 10 administrative MLMs. On average, the clinical MLMs generate 50357 simple interpretations of laboratory data and 1080 alerts each month. The number of alerts actually read varies by subject of the MLM from 32.4% to 73.5%. Most simple interpretations are not read at all. A significant problem of MLMs is maintenance, and changes in laboratory testing and message output can impair MLM execution significantly. We are now using relational database technology and coded MLM output to study the process outcome of our MDSS.
PMCID:2579077
PMID: 8563259
ISSN: 0195-4210
CID: 3651062
A general natural-language text processor for clinical radiology
Friedman, C; Alderson, P O; Austin, J H; Cimino, J J; Johnson, S B
OBJECTIVE: Development of a general natural-language processor that identifies clinical information in narrative reports and maps that information into a structured representation containing clinical terms. DESIGN/METHODS: The natural-language processor provides three phases of processing, all of which are driven by different knowledge sources. The first phase performs the parsing. It identifies the structure of the text through use of a grammar that defines semantic patterns and a target form. The second phase, regularization, standardizes the terms in the initial target structure via a compositional mapping of multi-word phrases. The third phase, encoding, maps the terms to a controlled vocabulary. Radiology is the test domain for the processor and the target structure is a formal model for representing clinical information in that domain. MEASUREMENTS/METHODS: The impression sections of 230 radiology reports were encoded by the processor. Results of an automated query of the resultant database for the occurrences of four diseases were compared with the analysis of a panel of three physicians to determine recall and precision. RESULTS: Without training specific to the four diseases, recall and precision of the system (combined effect of the processor and query generator) were 70% and 87%. Training of the query component increased recall to 85% without changing precision.
PMCID:116194
PMID: 7719797
ISSN: 1067-5027
CID: 3650982