Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7
Rambaut, Andrew; Drummond, Alexei J; Xie, Dong; Baele, Guy; Suchard, Marc A
Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) plays a central role in understanding evolutionary history from molecular sequence data. Visualizing and analyzing the MCMC-generated samples from the posterior distribution is a key step in any non-trivial Bayesian inference. We present the software package Tracer (version 1.7) for visualizing and analyzing the MCMC trace files generated through Bayesian phylogenetic inference. Tracer provides kernel density estimation, multivariate visualization, demographic trajectory reconstruction, conditional posterior distribution summary, and more. Tracer is open-source and available at http://beast.community/tracer.
PMCID: 6101584
PMID: 29718447
ISSN: 1076-836X
CID: 5170262
Recent advances in computational phylodynamics
Baele, Guy; Dellicour, Simon; Suchard, Marc A; Lemey, Philippe; Vrancken, Bram
Time-stamped, trait-annotated phylogenetic trees built from virus genome data are increasingly used for outbreak investigation and monitoring ongoing epidemics. This routinely involves reconstructing the spatial and demographic processes from large data sets to help unveil the patterns and drivers of virus spread. Such phylodynamic inferences can however become quite time-consuming as the dimensions of the data increase, which has led to a myriad of approaches that aim to tackle this complexity. To elucidate the current state of the art in the field of phylodynamics, we discuss recent developments in Bayesian inference and accompanying software, highlight methods for improving computational efficiency and relevant visualisation tools. As an alternative to fully Bayesian approaches, we touch upon conditional software pipelines that compromise between statistical coherence and turn-around-time, and we highlight the available software packages. Finally, we outline future directions that may facilitate the large-scale tracking of epidemics in near real time.
PMID: 30248578
ISSN: 1879-6265
CID: 5170292
Phylodynamic assessment of intervention strategies for the West African Ebola virus outbreak
Dellicour, Simon; Baele, Guy; Dudas, Gytis; Faria, Nuno R; Pybus, Oliver G; Suchard, Marc A; Rambaut, Andrew; Lemey, Philippe
Genetic analyses have provided important insights into Ebola virus spread during the recent West African outbreak, but their implications for specific intervention scenarios remain unclear. Here, we address this issue using a collection of phylodynamic approaches. We show that long-distance dispersal events were not crucial for epidemic expansion and that preventing viral lineage movement to any given administrative area would, in most cases, have had little impact. However, major urban areas were critical in attracting and disseminating the virus: preventing viral lineage movement to all three capitals simultaneously would have contained epidemic size to one-third. We also show that announcements of border closures were followed by a significant but transient effect on international virus dispersal. By quantifying the hypothetical impact of different intervention strategies, as well as the impact of barriers on dispersal frequency, our study illustrates how phylodynamic analyses can help to address specific epidemiological and outbreak control questions.
PMCID: 5993714
PMID: 29884821
ISSN: 2041-1723
CID: 5170272
Phylogenetic Factor Analysis
Tolkoff, Max R; Alfaro, Michael E; Baele, Guy; Lemey, Philippe; Suchard, Marc A
Phylogenetic comparative methods explore the relationships between quantitative traits adjusting for shared evolutionary history. This adjustment often occurs through a Brownian diffusion process along the branches of the phylogeny that generates model residuals or the traits themselves. For high-dimensional traits, inferring all pair-wise correlations within the multivariate diffusion is limiting. To circumvent this problem, we propose phylogenetic factor analysis (PFA) that assumes a small unknown number of independent evolutionary factors arise along the phylogeny and these factors generate clusters of dependent traits. Set in a Bayesian framework, PFA provides measures of uncertainty on the factor number and groupings, combines both continuous and discrete traits, integrates over missing measurements and incorporates phylogenetic uncertainty with the help of molecular sequences. We develop Gibbs samplers based on dynamic programming to estimate the PFA posterior distribution, over 3-fold faster than for multivariate diffusion and a further order-of-magnitude more efficiently in the presence of latent traits. We further propose a novel marginal likelihood estimator for previously impractical models with discrete data and find that PFA also provides a better fit than multivariate diffusion in evolutionary questions in columbine flower development, placental reproduction transitions and triggerfish fin morphometry.
PMCID: 5920329
PMID: 28950376
ISSN: 1076-836X
CID: 5170232
Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10
Suchard, Marc A; Lemey, Philippe; Baele, Guy; Ayres, Daniel L; Drummond, Alexei J; Rambaut, Andrew
The Bayesian Evolutionary Analysis by Sampling Trees (BEAST) software package has become a primary tool for Bayesian phylogenetic and phylodynamic inference from genetic sequence data. BEAST unifies molecular phylogenetic reconstruction with complex discrete and continuous trait evolution, divergence-time dating, and coalescent demographic models in an efficient statistical inference engine using Markov chain Monte Carlo integration. A convenient, cross-platform, graphical user interface allows the flexible construction of complex evolutionary analyses.
PMCID: 6007674
PMID: 29942656
ISSN: 2057-1577
CID: 5170282
PhyloGeoTool: interactively exploring large phylogenies in an epidemiological context
Libin, Pieter; Vanden Eynden, Ewout; Incardona, Francesca; Nowé, Ann; Bezenchek, Antonia; Sönnerborg, Anders; Vandamme, Anne-Mieke; Theys, Kristof; Baele, Guy
Motivation: Clinicians, health officials and researchers are interested in the epidemic spread of pathogens in both space and time to support the optimization of intervention measures and public health policies. Large sequence databases of virus sequences provide an interesting opportunity to study this spread through phylogenetic analysis. To infer knowledge from large phylogenetic trees, potentially encompassing tens of thousands of virus strains, an efficient method for data exploration is required. The clades that are visited during this exploration should be annotated with strain characteristics (e.g. transmission risk group, tropism, drug resistance profile) and their geographic context. Results: PhyloGeoTool implements a visual method to explore large phylogenetic trees and to depict characteristics of strains and clades, including their geographic context, in an interactive way. PhyloGeoTool also provides the possibility to position new virus strains relative to the existing phylogenetic tree, allowing users to gain insight into the placement of such new strains without the need to perform a de novo reconstruction of the phylogeny. Availability and implementation: https://github.com/rega-cev/phylogeotool (Freely available: open source software project). Contact: phylogeotool@kuleuven.be. Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: 5860094
PMID: 28961923
ISSN: 1367-4811
CID: 5170242
Host Genetic Variation Does Not Determine Spatio-Temporal Patterns of European Bat 1 Lyssavirus
Troupin, Cécile; Picard-Meyer, Evelyne; Dellicour, Simon; Casademont, Isabelle; Kergoat, Lauriane; Lepelletier, Anthony; Dacheux, Laurent; Baele, Guy; Monchâtre-Leroy, Elodie; Cliquet, Florence; Lemey, Philippe; Bourhy, Hervé
The majority of bat rabies cases in Europe are attributed to European bat 1 lyssavirus (EBLV-1), circulating mainly in serotine bats (Eptesicus serotinus). Two subtypes have been defined (EBLV-1a and EBLV-1b), each associated with a different geographical distribution. In this study, we undertake a comprehensive sequence analysis based on 80 newly obtained EBLV-1 nearly complete genome sequences from nine European countries over a 45-year period to infer selection pressures, rates of nucleotide substitution, and evolutionary time scale of these two subtypes in Europe. Our results suggest that the current lineage of EBLV-1 arose in Europe ∼600 years ago and the virus has evolved at an estimated average substitution rate of ∼4.19×10⁻⁵ subs/site/year, which is among the lowest recorded for RNA viruses. In parallel, we investigate the genetic structure of French serotine bats at both the nuclear and mitochondrial level and find that they constitute a single genetic cluster. Furthermore, Mantel tests based on interindividual distances reveal the absence of correlation between genetic distances estimated between viruses and between host individuals. Taken together, this indicates that the genetic diversity observed in our E. serotinus samples does not account for EBLV-1a and -1b segregation and dispersal in Europe.
PMCID: 5721339
PMID: 29165566
ISSN: 1759-6653
CID: 5170252
Adaptive MCMC in Bayesian phylogenetics: an application to analyzing partitioned data in BEAST
Baele, Guy; Lemey, Philippe; Rambaut, Andrew; Suchard, Marc A
Motivation: Advances in sequencing technology continue to deliver increasingly large molecular sequence datasets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and different positions within those genes, requiring a large number of evolutionary parameters that have to be estimated, leading to an increased computational burden for such analyses. The past two decades have also seen the rise of multi-core processors, both in the central processing unit (CPU) and graphics processing unit (GPU) processor markets, enabling massively parallel computations that are not yet fully exploited by many software packages for multipartite analyses. Results: We here propose a Markov chain Monte Carlo (MCMC) approach using an adaptive multivariate transition kernel to estimate in parallel a large number of parameters, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach enables the estimation of these multipartite parameters more efficiently than standard approaches that typically use a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous dataset, MCMC integration efficiency improves by >14-fold. Availability and implementation: Our implementation is part of the BEAST code base, a widely used open source software package to perform Bayesian phylogenetic inference. Contact: guy.baele@kuleuven.be. Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: 6044345
PMID: 28200071
ISSN: 1367-4811
CID: 5170192
A Relaxed Directional Random Walk Model for Phylogenetic Trait Evolution
Gill, Mandev S; Tung Ho, Lam Si; Baele, Guy; Lemey, Philippe; Suchard, Marc A
Understanding the processes that give rise to quantitative measurements associated with molecular sequence data remains an important issue in statistical phylogenetics. Examples of such measurements include geographic coordinates in the context of phylogeography and phenotypic traits in the context of comparative studies. A popular approach is to model the evolution of continuously varying traits as a Brownian diffusion process acting on a phylogenetic tree. However, standard Brownian diffusion is quite restrictive and may not accurately characterize certain trait evolutionary processes. Here, we relax one of the major restrictions of standard Brownian diffusion by incorporating a nontrivial estimable mean into the process. We introduce a relaxed directional random walk (RDRW) model for the evolution of multivariate continuously varying traits along a phylogenetic tree. Notably, the RDRW model accommodates branch-specific variation of directional trends while preserving model identifiability. Furthermore, our development of a computationally efficient dynamic programming approach to compute the data likelihood enables scaling of our method to large data sets frequently encountered in phylogenetic comparative studies and viral evolution. We implement the RDRW model in a Bayesian inference framework to simultaneously reconstruct the evolutionary histories of molecular sequence data and associated multivariate continuous trait data, and provide tools to visualize evolutionary reconstructions. We demonstrate the performance of our model on synthetic data, and we illustrate its utility in two viral examples. First, we examine the spatiotemporal spread of HIV-1 in central Africa and show that the RDRW model uncovers a clearer, more detailed picture of the dynamics of viral dispersal than standard Brownian diffusion. Second, we study antigenic evolution in the context of HIV-1 resistance to three broadly neutralizing antibodies. Our analysis reveals evidence of a continuous drift at the HIV-1 population level towards enhanced resistance to neutralization by the VRC01 monoclonal antibody over the course of the epidemic. Keywords: Brownian Motion; Diffusion Processes; Phylodynamics; Phylogenetics; Phylogeography; Trait Evolution.
PMCID: 6075548
PMID: 27798403
ISSN: 1076-836X
CID: 5170172
Virus genomes reveal factors that spread and sustained the Ebola epidemic
Dudas, Gytis; Carvalho, Luiz Max; Bedford, Trevor; Tatem, Andrew J; Baele, Guy; Faria, Nuno R; Park, Daniel J; Ladner, Jason T; Arias, Armando; Asogun, Danny; Bielejec, Filip; Caddy, Sarah L; Cotten, Matthew; D'Ambrozio, Jonathan; Dellicour, Simon; Di Caro, Antonino; Diclaro, Joseph W; Duraffour, Sophie; Elmore, Michael J; Fakoli, Lawrence S; Faye, Ousmane; Gilbert, Merle L; Gevao, Sahr M; Gire, Stephen; Gladden-Young, Adrianne; Gnirke, Andreas; Goba, Augustine; Grant, Donald S; Haagmans, Bart L; Hiscox, Julian A; Jah, Umaru; Kugelman, Jeffrey R; Liu, Di; Lu, Jia; Malboeuf, Christine M; Mate, Suzanne; Matthews, David A; Matranga, Christian B; Meredith, Luke W; Qu, James; Quick, Joshua; Pas, Suzan D; Phan, My V T; Pollakis, Georgios; Reusken, Chantal B; Sanchez-Lockhart, Mariano; Schaffner, Stephen F; Schieffelin, John S; Sealfon, Rachel S; Simon-Loriere, Etienne; Smits, Saskia L; Stoecker, Kilian; Thorne, Lucy; Tobin, Ekaete Alice; Vandi, Mohamed A; Watson, Simon J; West, Kendra; Whitmer, Shannon; Wiley, Michael R; Winnicki, Sarah M; Wohl, Shirlee; Wölfel, Roman; Yozwiak, Nathan L; Andersen, Kristian G; Blyden, Sylvia O; Bolay, Fatorma; Carroll, Miles W; Dahn, Bernice; Diallo, Boubacar; Formenty, Pierre; Fraser, Christophe; Gao, George F; Garry, Robert F; Goodfellow, Ian; Günther, Stephan; Happi, Christian T; Holmes, Edward C; Kargbo, Brima; Keïta, Sakoba; Kellam, Paul; Koopmans, Marion P G; Kuhn, Jens H; Loman, Nicholas J; Magassouba, N'Faly; Naidoo, Dhamari; Nichol, Stuart T; Nyenswah, Tolbert; Palacios, Gustavo; Pybus, Oliver G; Sabeti, Pardis C; Sall, Amadou; Ströher, Ute; Wurie, Isatta; Suchard, Marc A; Lemey, Philippe; Rambaut, Andrew
The 2013-2016 West African epidemic caused by the Ebola virus was of unprecedented magnitude, duration and impact. Here we reconstruct the dispersal, proliferation and decline of Ebola virus throughout the region by analysing 1,610 Ebola virus genomes, which represent over 5% of the known cases. We test the association of geography, climate and demography with viral movement among administrative regions, inferring a classic 'gravity' model, with intense dispersal between larger and closer populations. Despite attenuation of international dispersal after border closures, cross-border transmission had already sown the seeds for an international epidemic, rendering these measures ineffective at curbing the epidemic. We address why the epidemic did not spread into neighbouring countries, showing that these countries were susceptible to substantial outbreaks but at lower risk of introductions. Finally, we reveal that this large epidemic was a heterogeneous and spatially dissociated collection of transmission clusters of varying size, duration and connectivity. These insights will help to inform interventions in future epidemics.
PMCID: 5712493
PMID: 28405027
ISSN: 1476-4687
CID: 5170222