Searched for:

person:statna01

Total Results:

60


Computational Methods for Unraveling Temporal Brain Connectivity Data

Ray, Bisakha; Statnikov, Alexander; Aliferis, Constantin
Brain science is a frontier research area with great promise for understanding, preventing, and treating multiple diseases affecting millions of patients. Its key task of reconstructing neuronal brain connectivity poses unique Big Data Analysis challenges distinct from those in clinical or "-omics" domains. Our goal is to understand the strengths and limitations of reconstruction algorithms, measure performance and its determinants, and ultimately enhance performance and applicability. We devised a set of experiments in a well-controlled setting using an established gold-standard based on calcium fluorescence time series recordings of thousands of neurons sampled from a previously validated neuronal model of complex time-varying causal neuronal connections. Following empirical testing of several state-of-the-art reconstruction algorithms, and using the best-performing algorithms, we constructed features of a classifier and predicted the presence or absence of connections using meta-learning. This approach combines information-theoretic, feature construction, and pattern recognition meta-learning methods to considerably improve the Area under ROC curve performance. Our data are very promising toward the feasibility of reliably reconstructing complex neuronal connectivity.
PMCID:4765656
PMID: 26958304
ISSN: 1942-597x
CID: 2023562

Quantitative forecasting of PTSD from early trauma responses: A Machine Learning application

Galatzer-Levy, Isaac R; Karstoft, Karen-Inge; Statnikov, Alexander; Shalev, Arieh Y
There is broad interest in predicting the clinical course of mental disorders from early, multimodal clinical and biological information. Current computational models, however, constitute a significant barrier to realizing this goal. The early identification of trauma survivors at risk of post-traumatic stress disorder (PTSD) is plausible given the disorder's salient onset and the abundance of putative biological and clinical risk indicators. This work evaluates the ability of Machine Learning (ML) forecasting approaches to identify and integrate a panel of unique predictive characteristics and determine their accuracy in forecasting non-remitting PTSD from information collected within 10 days of a traumatic event. Data on event characteristics, emergency department observations, and early symptoms were collected in 957 trauma survivors, followed for fifteen months. An ML feature selection algorithm identified a set of predictors that rendered all others redundant. Support Vector Machines (SVMs) as well as other ML classification algorithms were used to evaluate the forecasting accuracy of i) ML selected features, ii) all available features without selection, and iii) Acute Stress Disorder (ASD) symptoms alone. SVMs were also used to compare the prediction of a) PTSD diagnostic status at 15 months to b) posterior probability of membership in an empirically derived non-remitting PTSD symptom trajectory. Results are expressed as mean Area Under Receiver Operating Characteristics Curve (AUC). The feature selection algorithm identified 16 predictors, present in ≥95% of cross-validation trials. The accuracy of predicting non-remitting PTSD from that set (AUC = .77) did not differ from predicting from all available information (AUC = .78). Predicting from ASD symptoms was not better than chance (AUC = .60). The prediction of PTSD status was less accurate than that of membership in a non-remitting trajectory (AUC = .71). ML methods may fill a critical gap in forecasting PTSD. The ability to identify and integrate unique risk indicators makes this a promising approach for developing algorithms that infer probabilistic risk of chronic posttraumatic stress psychopathology based on complex sources of biological, psychological, and social information.
PMCID:4252741
PMID: 25260752
ISSN: 0022-3956
CID: 1259832

A Comprehensive Empirical Comparison of Modern Supervised Classification and Feature Selection Methods for Text Categorization

Aphinyanaphongs, Yindalon; Fu, Lawrence D; Li, Zhiguo; Peskin, Eric R; Efstathiadis, Efstratios; Aliferis, Constantin F; Statnikov, Alexander
An important aspect of performing text categorization is selecting appropriate supervised classification and feature selection methods. A comprehensive benchmark is needed to inform best practices in this broad application field. Previous benchmarks have evaluated performance for only a few supervised classification and feature selection methods, with limited ways to optimize them. The present work updates prior benchmarks by increasing the number of classifiers and feature selection methods by an order of magnitude, including recently developed, state-of-the-art methods. Specifically, this study used 229 text categorization data sets/tasks, and evaluated 28 classification methods (both well-established and proprietary/commercial) and 19 feature selection methods according to 4 classification performance metrics. We report several key findings that will be helpful in establishing best methodological practices for text categorization.
ISI:000342346500002
ISSN: 2330-1643
CID: 1313832

Information content and analysis methods for Multi-Modal High-Throughput Biomedical Data

Ray, Bisakha; Henaff, Mikael; Ma, Sisi; Efstathiadis, Efstratios; Peskin, Eric R; Picone, Marco; Poli, Tito; Aliferis, Constantin F; Statnikov, Alexander
The spectrum of modern molecular high-throughput assaying includes diverse technologies such as microarray gene expression, miRNA expression, proteomics, DNA methylation, among many others. Now that these technologies have matured and become increasingly accessible, the next frontier is to collect "multi-modal" data for the same set of subjects and conduct integrative, multi-level analyses. While multi-modal data does contain distinct biological information that can be useful for answering complex biology questions, its value for predicting clinical phenotypes and contributions of each type of input remain unknown. We obtained 47 datasets/predictive tasks that in total span over 9 data modalities and executed analytic experiments for predicting various clinical phenotypes and outcomes. First, we analyzed each modality separately using uni-modal approaches based on several state-of-the-art supervised classification and feature selection methods. Then, we applied integrative multi-modal classification techniques. We have found that gene expression is the most predictively informative modality. Other modalities such as protein expression, miRNA expression, and DNA methylation also provide highly predictive results, which are often statistically comparable but not superior to gene expression data. Integrative multi-modal analyses generally do not increase predictive signal compared to gene expression data.
PMCID:3961740
PMID: 24651673
ISSN: 2045-2322
CID: 852022

Computational Prediction of Neutralization Epitopes Targeted by Human Anti-V3 HIV Monoclonal Antibodies

Shmelkov, Evgeny; Krachmarov, Chavdar; Grigoryan, Arsen V; Pinter, Abraham; Statnikov, Alexander; Cardozo, Timothy
The extreme diversity of HIV-1 strains presents a formidable challenge for HIV-1 vaccine design. Although antibodies (Abs) can neutralize HIV-1 and potentially protect against infection, antibodies that target the immunogenic viral surface protein gp120 have widely variable and poorly predictable cross-strain reactivity. Here, we developed a novel computational approach, the Method of Dynamic Epitopes, for identification of neutralization epitopes targeted by anti-HIV-1 monoclonal antibodies (mAbs). Our data demonstrate that this approach, based purely on calculated energetics and 3D structural information, accurately predicts the presence of neutralization epitopes targeted by V3-specific mAbs 2219 and 447-52D in any HIV-1 strain. The method was used to calculate the range of conservation of these specific epitopes across all circulating HIV-1 viruses. Accurately identifying an Ab-targeted neutralization epitope in a virus by computational means enables easy prediction of the breadth of reactivity of specific mAbs across the diversity of thousands of different circulating HIV-1 variants and facilitates rational design and selection of immunogens mimicking specific mAb-targeted epitopes in a multivalent HIV-1 vaccine. The defined epitopes can also be used for the purpose of epitope-specific analyses of breakthrough sequences recorded in vaccine clinical trials. Thus, our study is a prototype for a valuable tool for rational HIV-1 vaccine design.
PMCID:3934971
PMID: 24587168
ISSN: 1932-6203
CID: 829652

De-Novo Learning of Genome-Scale Regulatory Networks in S. cerevisiae

Ma, Sisi; Kemmeren, Patrick; Gresham, David; Statnikov, Alexander
De-novo reverse-engineering of genome-scale regulatory networks is a fundamental problem of biological and translational research. One of the major obstacles in developing and evaluating approaches for de-novo gene network reconstruction is the absence of high-quality genome-scale gold-standard networks of direct regulatory interactions. To establish a foundation for assessing the accuracy of de-novo gene network reverse-engineering, we constructed high-quality genome-scale gold-standard networks of direct regulatory interactions in Saccharomyces cerevisiae that incorporate binding and gene knockout data. Then we used 7 performance metrics to assess accuracy of 18 statistical association-based approaches for de-novo network reverse-engineering in 13 different datasets spanning over 4 data types. We found that most reconstructed networks had statistically significant accuracies. We also determined which statistical approaches and datasets/data types lead to networks with better reconstruction accuracies. While we found that de-novo reverse-engineering of the entire network is a challenging problem, it is possible to reconstruct sub-networks around some transcription factors with good accuracy. The latter transcription factors can be identified by assessing their connectivity in the inferred networks. Overall, this study provides the gene network reverse-engineering community with a rigorous assessment of the accuracy of S. cerevisiae gene network reconstruction and variability in performance of various approaches for learning both the entire network and sub-networks around transcription factors.
PMCID:4162580
PMID: 25215507
ISSN: 1932-6203
CID: 1209492

Text classification for automatic detection of alcohol use-related tweets: A feasibility study

Chapter by: Aphinyanaphongs, Y; Ray, B; Statnikov, A; Krebs, P
in: 2014 IEEE 15th International Conference on Information Reuse and Integration by
Piscataway, NJ : IEEE, 2014
pp. 93-97
ISBN: 978-1-4799-5880-1
CID: 1515072

Elevated Peripheral Blood Leukocyte Inflammatory Gene Expression in Radiographic Progressors with Symptomatic Knee Osteoarthritis: NYU and OAI Cohorts. [Meeting Abstract]

Attur, Mukundan; Statnikov, Alexander; Samuels, Svetlana Krasnokutsky; Kraus, Virginia B; Jordan, Joanne; Mitchell, Braxton D; Yau, Michelle; Patel, Jyoti; Aliferis, Constantin F; Hochberg, Marc C; Samuels, Jonathan; Abramson, Steven B
ISI:000344384900082
ISSN: 2326-5205
CID: 2331222

Forecasting non-remitting PTSD symptom trajectory by advanced modeling methods [Meeting Abstract]

Galatzer-Levy, I; Karstoft, K -I; Freedman, S; Ankri, Y; Gilad, M; Statnikov, A; Shalev, A Y
Background: Predicting pathways to chronic PTSD has significant clinical and public health implications, but current risk indicators are inconsistent and limited. Previously identified in a large cohort of recent trauma survivors (n=957) using Latent Growth Mixture Modeling, trajectories of PTSD symptoms from one week to fifteen months include Rapid Remission (56%), Slow Remission (27%), and Non-Remission (17%), the non-remission class comprising the majority of PTSD cases at fifteen months, and not responding to cognitive behavioral therapy (CBT). Innovative approaches to forecasting membership in a non-remitting, treatment-resistant class may improve our ability to identify, shortly after trauma exposure, survivors at high risk of developing PTSD. We tested the robustness of seven machine learning forecasting methods in predicting the non-remitting class from predictor variables collected during the days that followed trauma exposure. Methods: Consecutive trauma survivors admitted to a general hospital emergency department were screened and followed longitudinally, and n=125 with Acute PTSD received efficient cognitive behavioral therapy within a month of the traumatic event. CBT was equally distributed among trajectory classes. Survivors were followed regardless of, and blind to, their participation in treatment. Markov boundary feature selection was used to identify a parsimonious set of trajectory predictors from 68 candidate variables. That set was then used to compare seven classification algorithms [two variants of linear Support Vector Machines (SVMs), polynomial SVMs, AdaBoost, Random Forests, Bayesian Logistic Regression (BBR) and Kernel Ridge Regression (KRR)] for their ability to build accurate multivariate classification models separating non-remission from other trajectories.
Results: Variables selected by Markov boundary method robustly predic!
EMBASE:71278278
ISSN: 0893-133x
CID: 752892

Interleukin-1 Receptor Antagonist (IL-1Ra) Plasma Levels Predict Radiographic Progression Of Symptomatic Knee Osteoarthritis Over 24 Months [Meeting Abstract]

Attur, Mukundan; Statnikov, Alexander; Samuels, Jonathan; Krasnokutsky, Svetlana; Greenberg, Jeffrey D; Li, Zhiguo; Rybak, Leon; Aliferis, Constantin F; Abramson, Steven B
ISI:000325359204346
ISSN: 0004-3591
CID: 657592