Searched for: in-biosketch:yes person:sbj2002

Total Results: 127


Topological analysis of large-scale biomedical terminology structures

Bales, Michael E; Lussier, Yves A; Johnson, Stephen B
OBJECTIVE: To characterize global structural features of large-scale biomedical terminologies using currently emerging statistical approaches. DESIGN: Given rapid growth of terminologies, this research was designed to address scalability. We selected 16 terminologies covering a variety of domains from the UMLS Metathesaurus, a collection of terminological systems. Each was modeled as a network in which nodes were atomic concepts and links were relationships asserted by the source vocabulary. For comparison against each terminology we created three random networks of equivalent size and density. MEASUREMENTS: Average node degree, node degree distribution, clustering coefficient, average path length. RESULTS: Eight of 16 terminologies exhibited the small-world characteristics of a short average path length and strong local clustering. An overlapping subset of nine exhibited a power law distribution in node degrees, indicative of a scale-free architecture. We attribute these features to specific design constraints. Constraints on node connectivity, common in more synthetic classification systems, localize the effects of changes and deletions. In contrast, small-world and scale-free features, common in comprehensive medical terminologies, promote flexible navigation and less restrictive organic-like growth. CONCLUSION: While thought of as synthetic, grid-like structures, some controlled terminologies are structurally indistinguishable from natural language networks. This paradoxical result suggests that terminology structure is shaped not only by formal logic-based semantics, but by rules analogous to those that govern social networks and biological systems. Graph theoretic modeling shows early promise as a framework for describing terminology structure. Deeper understanding of these techniques may inform the development of scalable terminologies and ontologies.
PMCID: 2213477
PMID: 17712094
ISSN: 1067-5027
CID: 3586182
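
The measurements named in the abstract above (average node degree, clustering coefficient, average path length) can be illustrated with a minimal sketch. This is not the authors' code: the toy graph, the concept names, and the function names are all assumptions made for illustration, and the comparison against random networks of equivalent size and density is not shown.

```python
from collections import deque
from itertools import combinations

def average_degree(adj):
    """Mean number of links per node in an undirected graph."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

def clustering_coefficient(adj):
    """Mean local clustering: share of each node's neighbor pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean shortest-path length over all reachable pairs, via breadth-first search."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Hypothetical toy "terminology": a triangle of concepts plus one leaf concept.
toy_terminology = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

print(average_degree(toy_terminology))
print(clustering_coefficient(toy_terminology))
print(average_path_length(toy_terminology))
```

A small-world network in the sense of the abstract would show a short average path length together with clustering well above that of a random graph of the same size and density.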

Signout: a collaborative document with implications for the future of clinical information systems

Stein, Daniel M; Wrenn, Jesse O; Johnson, Stephen B; Stetson, Peter D
Signout is an unofficial clinical document used traditionally to facilitate patient handoff. Qualitative studies have suggested its importance in clinical care. We used a novel technique to quantify the use of signout by analyzing clinical information system logfiles. Viewing and editing events were collected for 1,677 unique patients admitted to our internal medicine service. We found the average patient's signout on a given day is viewed frequently (>6x) and edited frequently (>2x) with multiple unique viewers (>3) and editors (>1). We also found that signouts are used throughout a 24-hour period, not just at the time of handoff. Finally, we showed that they are viewed months and even years after their creation. Signout is therefore a highly utilized, collaborative, clinical document used for more than patient handoff. Our findings also suggest that clinical information systems may benefit from the introduction of collaborative tools such as subscription, versioning, and author-attribution utilities.
PMCID: 2655880
PMID: 18693926
ISSN: 1942-597x
CID: 3586232
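
The kind of logfile aggregation described in the abstract above (views, edits, unique viewers, and unique editors per patient per day) might be sketched as follows. The event tuple layout, the field values, and the function name are hypothetical; the abstract does not describe the actual log format.

```python
from collections import defaultdict

def summarize_signout_events(events):
    """Average per-patient-day usage from (patient_id, day, user, action) tuples.

    `action` is assumed to be either "view" or "edit" (hypothetical log format).
    """
    counts = defaultdict(lambda: {"view": 0, "edit": 0})
    users = defaultdict(lambda: {"view": set(), "edit": set()})
    for patient_id, day, user, action in events:
        key = (patient_id, day)
        counts[key][action] += 1
        users[key][action].add(user)
    n = len(counts)
    if n == 0:
        return {}
    return {
        "views": sum(c["view"] for c in counts.values()) / n,
        "edits": sum(c["edit"] for c in counts.values()) / n,
        "unique_viewers": sum(len(u["view"]) for u in users.values()) / n,
        "unique_editors": sum(len(u["edit"]) for u in users.values()) / n,
    }

# Made-up events covering two patient-days.
sample_events = [
    ("p1", "day1", "u1", "view"),
    ("p1", "day1", "u2", "view"),
    ("p1", "day1", "u1", "edit"),
    ("p2", "day1", "u3", "view"),
]
summary = summarize_signout_events(sample_events)
print(summary)
```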

Assessing data relevance for automated generation of a clinical summary

Van Vleck, Tielman T; Stein, Daniel M; Stetson, Peter D; Johnson, Stephen B
Clinicians perform many tasks in their daily work requiring summarization of clinical data. However, as technology makes more data available, the challenges of data overload become ever more significant. As interoperable data exchange between hospitals becomes more common, there is an increased need for tools to summarize information. Our goal is to develop automated tools to aid clinical data summarization. Structured interviews were conducted with physicians to identify information from an electronic health record they considered relevant to explaining the patient's medical history. Desirable data types were systematically evaluated using qualitative and quantitative analysis to assess data categories and patterns of data use. We report here on the implications of these results for the design of automated tools for summarization of patient history.
PMCID: 2655814
PMID: 18693939
ISSN: 1942-597x
CID: 3586242

An unsupervised machine learning approach to segmentation of clinician-entered free text

Wrenn, Jesse O; Stetson, Peter D; Johnson, Stephen B
Natural language processing, an important tool in biomedicine, fails without successful segmentation of words and sentences. Tokenization is a form of segmentation that identifies boundaries separating semantic units, for example words, dates, numbers and symbols, within a text. We sought to construct a highly generalizable tokenization algorithm with no prior knowledge of characters or their function, based solely on the inherent statistical properties of token and sentence boundaries. Tokenizing clinician-entered free text, we achieved precision and recall of 92% and 93%, respectively, compared to a whitespace token boundary detection algorithm. We classified over 80% of punctuation characters correctly, based on manual disambiguation with high inter-rater agreement (kappa=0.916). Our algorithm effectively discovered properties of whitespace and punctuation in the corpus without prior knowledge of either. Given the dynamic nature of biomedical language, and the variety of distinct sublanguages used, the effectiveness and generalizability of our novel tokenization algorithm make it a valuable tool.
PMCID: 2655800
PMID: 18693949
ISSN: 1942-597x
CID: 3586252
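
The evaluation mentioned in the abstract above, precision and recall of token boundaries against a whitespace baseline, can be sketched as below. This shows only the scoring step, not the unsupervised algorithm itself; the sample text, the predicted offsets, and the function names are assumptions for illustration.

```python
def whitespace_boundaries(text):
    """Character offsets where a whitespace-delimited token starts or ends."""
    bounds = set()
    for i in range(len(text) + 1):
        left_is_space = i == 0 or text[i - 1].isspace()
        right_is_space = i == len(text) or text[i].isspace()
        if left_is_space != right_is_space:
            bounds.add(i)
    return bounds

def boundary_precision_recall(predicted, reference):
    """Precision and recall of predicted boundary offsets against a reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical note fragment and a hypothetical tokenizer output that also
# splits around the slash in "120/80".
note = "BP 120/80 today"
reference = whitespace_boundaries(note)
predicted = {0, 2, 3, 6, 7, 9, 10, 15}

precision, recall = boundary_precision_recall(predicted, reference)
print(precision, recall)
```

In this toy case the extra boundaries around the slash lower precision against the whitespace baseline while recall stays complete, which shows why agreement with a whitespace baseline is only a partial measure of tokenization quality.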

Feasibility study of speech recognition for gathering information needs

Natarajan, Karthik; Duffy, Robert F; Johnson, Stephen B; Mendonça, Eneida A
Automated speech recognition (ASR) is used in many areas of medicine today. However, not many studies have evaluated the usefulness of ASR applications for capturing clinician information needs in noisy environments. We evaluated 72 ASR-transcribed clinician-generated questions and assessed them for semantic and syntactic errors. The results showed that basic user training is not sufficient to capture the semantics of recordings.
PMID: 18694157
ISSN: 1942-597x
CID: 3586262

Conceptual knowledge acquisition in biomedicine: A methodological review

Payne, Philip R O; Mendonça, Eneida A; Johnson, Stephen B; Starren, Justin B
The use of conceptual knowledge collections or structures within the biomedical domain is pervasive, spanning a variety of applications including controlled terminologies, semantic networks, ontologies, and database schemas. A number of theoretical constructs and practical methods or techniques support the development and evaluation of conceptual knowledge collections. This review will provide an overview of the current state of knowledge concerning conceptual knowledge acquisition, drawing from multiple contributing academic disciplines such as biomedicine, computer science, cognitive science, education, linguistics, semiotics, and psychology. In addition, multiple taxonomic approaches to the description and selection of conceptual knowledge acquisition and evaluation techniques will be proposed in order to partially address the apparent fragmentation of the current literature concerning this domain.
PMCID: 2082059
PMID: 17482521
ISSN: 1532-0480
CID: 3586172

Bridging the gap between biological and clinical informatics in a graduate training program

Johnson, Stephen B; Friedman, Richard A
Several training programs in biomedical informatics in the United States are attempting to integrate biological and clinical informatics. However, significant differences in the cultures underlying these two disciplines pose barriers to a uniform educational solution. This paper recounts the experience at Columbia University in adapting a graduate program with an initial focus on clinical informatics to train bioinformaticians. The analysis begins by considering the development of the medical and biological informatics cultures over a 17-year period. Then we review how two separate curricula evolved to serve the needs of each group. Interviews with bioinformatics students and faculty indicated some dissatisfaction with the curriculum that developed within clinical informatics. Their comments are considered in the light of an analysis of the relationship between the application domains of biomedical informatics as a discipline. In response, a new curriculum was developed in which bioinformatics and clinical informatics are regarded as subdivisions of the same subject. A key feature of this curriculum is a new course, Theory and Methods in Biomedical Informatics, which presents informatics principles in their general form, and illustrates their application with examples drawn from across the biomedical spectrum. The paper concludes with suggestions for integrating informatics training programs at other institutions.
PMID: 16616697
ISSN: 1532-0480
CID: 3586092

A day in the life of a clinical research coordinator: observations from community practice settings

Khan, Sharib A; Kukafka, Rita; Payne, Philip R O; Bigger, J Thomas; Johnson, Stephen B
One of the goals of the NIH Roadmap Initiative is to re-engineer the national clinical research enterprise, with an emphasis on information technology solutions. Understanding end-users' workflow is critical to developing technology systems that are grounded in the context of the users' environment and are designed to fulfill their needs. Community practices are becoming the prevailing setting for conducting clinical research. Few studies have assessed clinical research workflow in such settings. We have conducted a series of investigations to model the workflow and have previously reported on some basic aspects of it, such as the lack of information systems to support the workflow. In this paper we describe finer details of the workflow, using results of observational studies. These findings highlight the needs and inefficiencies that suggest the kind of information system that must be developed to enhance collaboration and communication and to improve efficiency. This preliminary investigation also opens ground for more extensive studies to further elucidate the workflow.
PMID: 17911716
ISSN: 0926-9630
CID: 3586192

Reengineering clinical research with informatics

Chung, Thomas K; Kukafka, Rita; Johnson, Stephen B
The future success of the translational research spectrum depends on the clinical research enterprise's ability to break through the barriers that constrain its productivity. As more basic science discoveries emerge, our ability to effectively translate this knowledge into improved patient care rests squarely on the manner in which we answer clinical questions. Informatics--the science of effective information use--is poised to help advance the conduct of science. However, incorporating informatics into the enterprise comes with its own set of challenges. To harness the benefits of improved information use, it is important to first establish how information flows within research. A thoughtful implementation of informatics--one that factors in social and organizational nuances--will undoubtedly lead to a more efficient and effective clinical research enterprise.
PMID: 17134616
ISSN: 1081-5589
CID: 3586142

Graph theoretic modeling of large-scale semantic networks

Bales, Michael E; Johnson, Stephen B
During the past several years, social network analysis methods have been used to model many complex real-world phenomena, including social networks, transportation networks, and the Internet. Graph theoretic methods, based on an elegant representation of entities and relationships, have been used in computational biology to study biological networks; however, they have not yet been adopted widely by the greater informatics community. The graphs produced are generally large, sparse, and complex, and share common global topological properties. In this review of research (1998-2005) on large-scale semantic networks, we used a tailored search strategy to identify articles involving both a graph theoretic perspective and semantic information. Thirty-one relevant articles were retrieved. The majority (28, 90.3%) involved an investigation of a real-world network. These included corpora, thesauri, dictionaries, large computer programs, biological neuronal networks, word association networks, and files on the Internet. Twenty-two of the 28 (78.6%) involved a graph composed of words or phrases. Fifteen of the 28 (53.6%) mentioned evidence of small-world characteristics in the network investigated. Eleven (39.3%) reported a scale-free topology, which tends to have a similar appearance when examined at varying scales. The results of this review indicate that networks generated from natural language have topological properties common to other natural phenomena. It has not yet been determined whether artificial human-curated terminology systems in biomedicine share these properties. Large network analysis methods have potential application in a variety of areas of informatics, such as in development of controlled vocabularies and for characterizing a given domain.
PMID: 16442849
ISSN: 1532-0480
CID: 3586072