Searched for:

person:ful01

Total Results:

12


Evaluation of liquid from the papanicolaou test and other liquid biopsies for the detection of endometrial and ovarian cancers [Note]

Wang, Y; Li, L; Douville, C; Cohen, J D; Yen, T -T; Kinde, I; Sundfelt, K; Kjaer, S K; Hruban, R H; Shih, I -M; Wang, T -L; Kurman, R J; Springer, S; Ptak, J; Popoli, M; Schaefer, J; Silliman, N; Dobbyn, L; Tanner, E J; Angarita, A; Lycke, M; Jochumsen, K; Afsari, B; Danilova, L; Levine, D A; Jardon, K; Zeng, X; Arseneau, J; Fu, L; Diaz, L A; Karchin, R; Tomasetti, C; Kinzler, K W; Vogelstein, B; Fader, A N; Gilbert, L; Papadopoulos, N
EMBASE:623738487
ISSN: 1533-9866
CID: 3287662

A population-based study of ethnicity and breast cancer stage at diagnosis in Ontario

Ginsburg, O M; Fischer, H D; Shah, B R; Lipscombe, L; Fu, L; Anderson, G M; Rochon, P A
BACKGROUND: Breast cancer stage at diagnosis is an important predictor of survival. Our goal was to compare breast cancer stage at diagnosis (by American Joint Committee on Cancer criteria) in Chinese and South Asian women with stage at diagnosis in the remaining general population in Ontario. METHODS: We used the Ontario population-based cancer registry to identify all women diagnosed with breast cancer during 2005-2010, and we applied a validated surname algorithm to identify South Asian and Chinese women. We used logistic regression to compare, for Chinese or South Asian women and for the remaining general population, the frequency of diagnoses at stage II compared with stage I and stages II-IV compared with stage I. RESULTS: The registry search identified 1304 Chinese women, 705 South Asian women, and 39,287 women in the remaining general population. The Chinese and South Asian populations were younger than the remaining population (mean: 54, 57, and 61 years respectively). Adjusted for age, South Asian women were more often diagnosed with breast cancer at stage II than at stage I [odds ratio (OR): 1.28; 95% confidence interval (CI): 1.08 to 1.51] or at stages II-IV than at stage I (OR: 1.27; 95% CI: 1.08 to 1.48); Chinese women were less likely to be diagnosed at stage II than at stage I (OR: 0.82; 95% CI: 0.72 to 0.92) or at stages II-IV than at stage I (OR: 0.73; 95% CI: 0.65 to 0.82). CONCLUSIONS: Breast cancers were diagnosed at a later stage in South Asian women and at an earlier stage in Chinese women than in the remaining population. A more detailed analysis of ethnocultural factors influencing breast screening uptake, retention, and care-seeking behavior might be needed to help inform and evaluate tailored health promotion activities.
PMCID:4399617
PMID: 25908908
ISSN: 1198-0052
CID: 2474042

A Comprehensive Empirical Comparison of Modern Supervised Classification and Feature Selection Methods for Text Categorization

Aphinyanaphongs, Yindalon; Fu, Lawrence D; Li, Zhiguo; Peskin, Eric R; Efstathiadis, Efstratios; Aliferis, Constantin F; Statnikov, Alexander
An important aspect of performing text categorization is selecting appropriate supervised classification and feature selection methods. A comprehensive benchmark is needed to inform best practices in this broad application field. Previous benchmarks have evaluated performance for a few supervised classification and feature selection methods and limited ways to optimize them. The present work updates prior benchmarks by increasing the number of classifiers and feature selection methods by an order of magnitude, including adding recently developed, state-of-the-art methods. Specifically, this study used 229 text categorization data sets/tasks, and evaluated 28 classification methods (both well-established and proprietary/commercial) and 19 feature selection methods according to 4 classification performance metrics. We report several key findings that will be helpful in establishing best methodological practices for text categorization.
ISI:000342346500002
ISSN: 2330-1643
CID: 1313832

Computer models for identifying instrumental citations in the biomedical literature

Fu, Lawrence D.; Aphinyanaphongs, Yindalon; Aliferis, Constantin F.
The most popular method for evaluating the quality of a scientific publication is citation count. This metric assumes that a citation is a positive indicator of the quality of the cited work. This assumption is not always true since citations serve many purposes. As a result, citation count is an indirect and imprecise measure of impact. If instrumental citations could be reliably distinguished from non-instrumental ones, this would readily improve the performance of existing citation-based metrics by excluding the non-instrumental citations. A citation was operationally defined as instrumental if either of the following was true: the hypothesis of the citing work was motivated by the cited work, or the citing work could not have been executed without the cited work. This work investigated the feasibility of developing computer models for automatically classifying citations as instrumental or non-instrumental. Instrumental citations were manually labeled, and machine learning models were trained on a combination of content and bibliometric features. The experimental results indicate that models based on content and bibliometric features are able to automatically classify instrumental citations with high predictivity (AUC = 0.86). Additional experiments using independent hold-out data and prospective validation show that the models are generalizable and can handle unseen cases. This work demonstrates that it is feasible to train computer models to automatically identify instrumental citations.
[Fu, Lawrence D.; Aphinyanaphongs, Yindalon] NYU Med Ctr, Ctr Hlth Informat & Bioinformat, Dept Med, New York, NY 10016 USA. [Aliferis, Constantin F.] NYU Med Ctr, Ctr Hlth Informat & Bioinformat, Dept Pathol, New York, NY 10016 USA.
ISI:000327219900020
ISSN: 0138-9130
CID: 687922

Identifying unproven cancer treatments on the health web: addressing accuracy, generalizability and scalability

Aphinyanaphongs, Yin; Fu, Lawrence D; Aliferis, Constantin F
Building machine learning models that identify unproven cancer treatments on the Health Web is a promising approach for dealing with the dissemination of false and dangerous information to vulnerable health consumers. Aside from the obvious requirement of accuracy, two issues are of practical importance in deploying these models in real-world applications. (a) Generalizability: The models must generalize to all treatments (not just the ones used in the training of the models). (b) Scalability: The models can be applied efficiently to billions of documents on the Health Web. First, we provide methods and related empirical data demonstrating strong accuracy and generalizability. Second, by combining the MapReduce distributed architecture and high dimensionality compression via Markov Boundary feature selection, we show how to scale the application of the models to WWW-scale corpora. The present work provides evidence that (a) a very small subset of unproven cancer treatments is sufficient to build a model to identify unproven treatments on the web; (b) unproven treatments use distinct language to market their claims and this language is learnable; (c) through distributed parallelization and state-of-the-art feature selection, it is possible to prepare the corpora and build and apply models with large scalability.
PMCID:4162393
PMID: 23920640
ISSN: 0926-9630
CID: 484192

A comparison of evaluation metrics for biomedical journals, articles, and websites in terms of sensitivity to topic

Fu, Lawrence D; Aphinyanaphongs, Yindalon; Wang, Lily; Aliferis, Constantin F
Evaluating the biomedical literature and health-related websites for quality are challenging information retrieval tasks. Current commonly used methods include impact factor for journals, PubMed's clinical query filters and machine learning-based filter models for articles, and PageRank for websites. Previous work has focused on the average performance of these methods without considering the topic, and it is unknown how performance varies for specific topics or focused searches. Clinicians, researchers, and users should be aware when expected performance is not achieved for specific topics. The present work analyzes the behavior of these methods for a variety of topics. Impact factor, clinical query filters, and PageRank vary widely across different topics, while a topic-specific impact factor and machine learning-based filter models are more stable. The results demonstrate that a method may perform excellently on average but struggle when used on a number of narrower topics. Topic-adjusted metrics and other topic-robust methods have an advantage in such situations. Users of traditional topic-sensitive metrics should be aware of their limitations.
PMCID:3143298
PMID: 21419864
ISSN: 1532-0480
CID: 135570

Trends and developments in bioinformatics in 2010: prospects and perspectives

Aliferis, C F; Alekseyenko, A V; Aphinyanaphongs, Y; Brown, S; Fenyo, D; Fu, L; Shen, S; Statnikov, A; Wang, J
OBJECTIVES: To survey major developments and trends in the field of Bioinformatics in 2010 and their relationships to those of previous years, with emphasis on long-term trends, on best practices, on quality of the science of informatics, and on quality of science as a function of informatics. METHODS: A critical review of articles in the literature of Bioinformatics over the past year. RESULTS: Our main results suggest that Bioinformatics continues to be a major catalyst for progress in Biology and Translational Medicine, as a consequence of new assaying technologies, most predominantly Next Generation Sequencing, which are changing the landscape of modern biological and medical research. These assays critically depend on bioinformatics and have led to quick growth of corresponding informatics methods development. Clinical-grade molecular signatures are proliferating at a rapid rate. However, a highly publicized incident at a prominent university showed that deficiencies in informatics methods can lead to catastrophic consequences for important scientific projects. Developing evidence-driven protocols and best practices is greatly needed given how serious the implications are for the quality of translational and basic science. CONCLUSIONS: Several exciting new methods have appeared over the past 18 months that open new roads for progress in bioinformatics methods and their impact in biomedicine. At the same time, the range of open problems of great significance is extensive, ensuring the vitality of the field for many years to come.
PMID: 21938341
ISSN: 0943-4747
CID: 174460

Using content-based and bibliometric features for machine learning models to predict citation counts in the biomedical literature

Fu, Lawrence D.; Aliferis, Constantin F.
The most popular method for judging the impact of biomedical articles is citation count, which is the number of citations received. The most significant limitation of citation count is that it cannot evaluate articles at the time of publication since citations accumulate over time. This work presents computer models that accurately predict citation counts of biomedical publications within a deep horizon of 10 years using only predictive information available at publication time. Our experiments show that it is indeed feasible to accurately predict future citation counts with a mixture of content-based and bibliometric features using machine learning methods. The models pave the way for practical prediction of the long-term impact of publication, and their statistical analysis provides greater insight into citation behavior.
BIOABSTRACTS:BACD201000391628
ISSN: 0138-9130
CID: 113731

Models for predicting and explaining citation count of biomedical articles

Fu, Lawrence D; Aliferis, Constantin
The single most important bibliometric criterion for judging the impact of biomedical papers and their authors' work is the number of citations received, which is commonly referred to as citation count. This metric, however, is unavailable until several years after publication time. In the present work, we build computer models that accurately predict citation counts of biomedical publications within a deep horizon of ten years using only predictive information available at publication time. Our experiments show that it is indeed feasible to accurately predict future citation counts with a mixture of content-based and bibliometric features using machine learning methods. The models pave the way for practical prediction of the long-term impact of publication, and their statistical analysis provides greater insight into citation behavior.
PMCID:2656101
PMID: 18999029
ISSN: 1559-4076
CID: 104071

Correlation between number of beams and monitor units, in the context of VMAT and RapidArc, in IMRT [Meeting Abstract]

Fu, L; Das, Indra
ORIGINAL:0011319
ISSN: 0094-2405
CID: 2234822