Searched for:

person:shenhl01

in-biosketch:yes

Total Results:

24


FEAST: fast expectation-maximization for microbial source tracking

Shenhav, Liat; Thompson, Mike; Joseph, Tyler A; Briscoe, Leah; Furman, Ori; Bogumil, David; Mizrahi, Itzhak; Pe'er, Itsik; Halperin, Eran
A major challenge of analyzing the compositional structure of microbiome data is identifying its potential origins. Here, we introduce fast expectation-maximization microbial source tracking (FEAST), a ready-to-use scalable framework that can simultaneously estimate the contribution of thousands of potential source environments in a timely manner, thereby helping unravel the origins of complex microbial communities (https://github.com/cozygene/FEAST). The information gained from FEAST may provide insight into quantifying contamination, tracking the formation of developing microbial communities, as well as distinguishing and characterizing bacteria-related health conditions.
PMCID: 8535041
PMID: 31182859
ISSN: 1548-7105
CID: 5266262

Modeling the temporal dynamics of the gut microbial community in adults and infants

Shenhav, Liat; Furman, Ori; Briscoe, Leah; Thompson, Mike; Silverman, Justin D; Mizrahi, Itzhak; Halperin, Eran
Given the highly dynamic and complex nature of the human gut microbial community, the ability to identify and predict time-dependent compositional patterns of microbes is crucial to our understanding of the structure and functions of this ecosystem. One factor that could affect such time-dependent patterns is microbial interactions, wherein community composition at a given time point affects the microbial composition at a later time point. However, the field has not yet settled on the degree of this effect. Specifically, it has been recently suggested that only a minority of taxa depend on the microbial composition in earlier times. To address the issue of identifying and predicting temporal microbial patterns we developed a new model, MTV-LMM (Microbial Temporal Variability Linear Mixed Model), a linear mixed model for the prediction of microbial community temporal dynamics. MTV-LMM can identify time-dependent microbes (i.e., microbes whose abundance can be predicted based on the previous microbial composition) in longitudinal studies, which can then be used to analyze the trajectory of the microbiome over time. We evaluated the performance of MTV-LMM on real and synthetic time series datasets, and found that MTV-LMM outperforms commonly used methods for microbiome time series modeling. Particularly, we demonstrate that the effect of the microbial composition in previous time points on the abundance of taxa at later time points is underestimated by a factor of at least 10 when applying previous approaches. Using MTV-LMM, we demonstrate that a considerable portion of the human gut microbiome, both in infants and adults, has a significant time-dependent component that can be predicted based on microbiome composition in earlier time points. This suggests that microbiome composition at a given time point is a major factor in defining future microbiome composition and that this phenomenon is considerably more common than previously reported for the human gut microbiome.
PMCID: 6597035
PMID: 31246943
ISSN: 1553-7358
CID: 5266282

Re-examining the robustness of voice features in predicting depression: Compared with baseline of confounders

Pan, Wei; Flint, Jonathan; Shenhav, Liat; Liu, Tianli; Liu, Mingming; Hu, Bin; Zhu, Tingshao
A large proportion of Depression Disorder patients do not receive an effective diagnosis, which makes it necessary to find a more objective assessment to facilitate a more rapid and accurate diagnosis of depression. Speech data is easy to acquire clinically, and its association with depression has been studied, although the actual predictive effect of voice features has not been examined. Thus, we do not have a general understanding of the extent to which voice features contribute to the identification of depression. In this study, we investigated the significance of the association between voice features and depression using binary logistic regression, and the actual classification effect of voice features on depression was re-examined through classification modeling. Nearly 1000 Chinese females participated in this study. Several different datasets were included as test sets. We found that 4 voice features (PC1, PC6, PC17, PC24, P<0.05, corrected) made a significant contribution to depression, and that the contribution effect of the voice features alone reached 35.65% (Nagelkerke's R2). In classification modeling, the voice-data-based model has consistently higher predictive accuracy (F-measure) than the baseline model of demographic data when tested on different datasets, even across different emotional contexts. The F-measure of voice features alone reached 81%, consistent with existing data. These results demonstrate that voice features are effective in predicting depression and indicate that more sophisticated models based on voice features can be built to help in clinical diagnosis.
PMCID: 6586278
PMID: 31220113
ISSN: 1932-6203
CID: 5266272

Statistical Considerations in the Design and Analysis of Longitudinal Microbiome Studies

Silverman, Justin D; Shenhav, Liat; Halperin, Eran; Mukherjee, Sayan; David, Lawrence A
ORIGINAL:0016061
ISSN: 2692-8205
CID: 5340062

BayesCCE: a Bayesian framework for estimating cell-type composition from DNA methylation without the need for methylation reference

Rahmani, Elior; Schweiger, Regev; Shenhav, Liat; Wingert, Theodora; Hofer, Ira; Gabel, Eilon; Eskin, Eleazar; Halperin, Eran
We introduce a Bayesian semi-supervised method for estimating cell counts from DNA methylation by leveraging an easily obtainable prior knowledge on the cell-type composition distribution of the studied tissue. We show mathematically and empirically that alternative methods which attempt to infer cell counts without methylation reference only capture linear combinations of cell counts rather than provide one component per cell type. Our approach allows the construction of components such that each component corresponds to a single cell type, and provides a new opportunity to investigate cell compositions in genomic studies of tissues for which it was not possible before.
PMCID: 6151042
PMID: 30241486
ISSN: 1474-760X
CID: 5266252

Using Stochastic Approximation Techniques to Efficiently Construct Confidence Intervals for Heritability

Schweiger, Regev; Fisher, Eyal; Rahmani, Elior; Shenhav, Liat; Rosset, Saharon; Halperin, Eran
Estimation of heritability is an important task in genetics. The use of linear mixed models (LMMs) to determine narrow-sense single-nucleotide polymorphism (SNP)-heritability and related quantities has received much recent attention, due to its ability to account for variants with small effect sizes. Typically, heritability estimation under LMMs uses the restricted maximum likelihood (REML) approach. The common way to report the uncertainty in REML estimation uses standard errors (SEs), which rely on asymptotic properties. However, these assumptions are often violated because of the bounded parameter space, statistical dependencies, and limited sample size, leading to biased estimates and inflated or deflated confidence intervals (CIs). In addition, for larger data sets (e.g., tens of thousands of individuals), the construction of SEs itself may require considerable time, as it requires expensive matrix inversions and multiplications. Here, we present FIESTA (Fast confidence IntErvals using STochastic Approximation), a method for constructing accurate CIs. FIESTA is based on parametric bootstrap sampling, and, therefore, avoids unjustified assumptions on the distribution of the heritability estimator. FIESTA uses stochastic approximation techniques, which accelerate the construction of CIs by several orders of magnitude, compared with previous approaches as well as with the analytical approximation used by SEs. FIESTA builds accurate CIs rapidly, for example, requiring only several seconds for data sets of tens of thousands of individuals, making FIESTA a very fast solution to the problem of building accurate CIs for heritability for all data set sizes.
PMID: 29932739
ISSN: 1557-8666
CID: 5266242

A Bayesian Framework for Estimating Cell Type Composition from DNA Methylation Without the Need for Methylation Reference

Rahmani, Elior; Schweiger, Regev; Shenhav, Liat; Wingert, Theodora; Hofer, Ira; Gabel, Eilon; Eskin, Eleazar; Halperin, Eran
ORIGINAL:0016064
ISSN: 2692-8205
CID: 5340092

GLINT: a user-friendly toolset for the analysis of high-throughput DNA-methylation array data

Rahmani, Elior; Yedidim, Reut; Shenhav, Liat; Schweiger, Regev; Weissbrod, Omer; Zaitlen, Noah; Halperin, Eran
SUMMARY/CONCLUSIONS: GLINT is a user-friendly command-line toolset for fast analysis of genome-wide DNA methylation data generated using the Illumina human methylation arrays. GLINT, which does not require any programming proficiency, allows an easy execution of Epigenome-Wide Association Study analysis pipeline under different models while accounting for known confounders in methylation data. AVAILABILITY AND IMPLEMENTATION/METHODS: GLINT is a command-line software, freely available at https://github.com/cozygene/glint/releases. It requires Python 2.7 and several freely available Python packages. Further information and documentation as well as a quick start tutorial are available at http://glint-epigenetics.readthedocs.io. CONTACT/BACKGROUND: elior.rahmani@gmail.com or ehalperin@cs.ucla.edu.
PMCID: 5870777
PMID: 28177067
ISSN: 1367-4811
CID: 5266232

Genome-wide methylation data mirror ancestry information

Rahmani, Elior; Shenhav, Liat; Schweiger, Regev; Yousefi, Paul; Huen, Karen; Eskenazi, Brenda; Eng, Celeste; Huntsman, Scott; Hu, Donglei; Galanter, Joshua; Oh, Sam S; Waldenberger, Melanie; Strauch, Konstantin; Grallert, Harald; Meitinger, Thomas; Gieger, Christian; Holland, Nina; Burchard, Esteban G; Zaitlen, Noah; Halperin, Eran
BACKGROUND: Genetic data are known to harbor information about human demographics, and genotyping data are commonly used for capturing ancestry information by leveraging genome-wide differences between populations. In contrast, it is not clear to what extent population structure is captured by whole-genome DNA methylation data. RESULTS: [...]-located SNPs. Based on these insights, we propose a method, EPISTRUCTURE, for the inference of ancestry from methylation data, without the need for genotype data. CONCLUSIONS: EPISTRUCTURE can be used to infer ancestry information of individuals based on their methylation data in the absence of corresponding genetic data. Although genetic data are often collected in epigenetic studies of large cohorts, these are typically not made publicly available, making the application of EPISTRUCTURE especially useful for anyone working on public data. Implementation of EPISTRUCTURE is available in GLINT, our recently released toolset for DNA methylation analysis at: http://glint-epigenetics.readthedocs.io.
PMCID: 5267476
PMID: 28149326
ISSN: 1756-8935
CID: 5266222

Algorithmic Advances and Applications from RECOMB 2017 [Editorial]

Dao, Phuong; Kim, Yoo-Ah; Wojtowicz, Damian; Przytycka, Teresa M.; Madan, Sanna; Sharan, Roded; Zaccaria, Simone; El-Kebir, Mohammed; Raphael, Benjamin J.; Klau, Gunnar W.; Hristov, Borislav; Singh, Mona; Rajaraman, Ashok; Ma, Jian; Wang, Xiaoqian; Huang, Heng; Yan, Jingwen; Yao, Xiaohui; Kim, Sungeun; Nho, Kwangsik; Risacher, Shannon L.; Saykin, Andrew J.; Shen, Li; Schweiger, Regev; Fisher, Eyal; Halperin, Eran; Xu, Jinbo; Ojewole, Adegoke A.; Jou, Jonathan D.; Fowler, Vance G.; Donald, Bruce R.; Haussler, David; Smuga-Otto, Maciej; Paten, Benedict; Novak, Adam; Nikitin, Sergei; Zueva, Maria; Dmitrii, Miagkov; Mukherjee, Sudipto; Chaisson, Mark; Kannan, Sreeram; Eichler, Evan; Paten, Benedict; Novak, Adam; Garrison, Erik; Dawson, Eric; Hickey, Glenn; DeBlasio, Dan; Kececioglu, John; Shlemov, Alexander; Bankevich, Sergey; Bzikadze, Andrey; Safonova, Yana; Pevzner, Pavel; Rahmani, Elior; Shenhav, Liat; Eskin, Eleazar
ISI:000411874500007
ISSN: 2405-4712
CID: 5266402