Natural RNA interference directs a heritable response to the environment
Schott, Daniel; Yanai, Itai; Hunter, Craig P
RNA interference can induce heritable gene silencing, but it remains unexplored whether similar mechanisms play a general role in responses to cues that occur in the wild. We show that transient, mild heat stress in the nematode Caenorhabditis elegans results in changes in messenger RNA levels that last for more than one generation. The affected transcripts are enriched for genes targeted by germline siRNAs downstream of the piRNA pathway, and worms defective for germline RNAi are defective for these heritable effects. Our results demonstrate that a specific siRNA pathway transmits information about variable environmental conditions between generations.
PMCID: 4894413
PMID: 25552271
ISSN: 2045-2322
CID: 2049882
Gene length and expression level shape genomic novelties
Grishkevich, Vladislav; Yanai, Itai
Gene duplication and alternative splicing are important mechanisms in the production of genomic novelties. Previous work has shown that a gene's family size and the number of splice variants it produces are inversely related, although the underlying reason is not well understood. Here, we report that gene length and expression level together explain this relationship. We found that gene lengths correlate with both gene duplication and alternative splicing: Longer genes are less likely to produce duplicates and more likely to exhibit alternative splicing. We show that gene length is a dynamic property, increasing with evolutionary time (due in part to the insertions of transposable elements) and decreasing following partial gene duplications. However, gene length alone does not account for the relationship between alternative splicing and gene duplication. A gene's expression level appears both to impose a strong constraint on its length and to restrict gene duplications. Furthermore, high gene expression promotes alternative splicing, in particular for long genes, and alternatively, short genes with low expression levels have large gene families. Our analysis of the human and mouse genomes shows that gene length and expression level are primary genic properties that together account for the relationship between gene duplication and alternative splicing and bias the origin of novelties in the genome.
PMCID: 4158763
PMID: 25015383
ISSN: 1549-5469
CID: 2049892
Seeing is believing: new methods for in situ single-cell transcriptomics [Comment]
Avital, Gal; Hashimshony, Tamar; Yanai, Itai
New methods employ RNA-seq to study single cells within complex tissues by in situ sequencing or mRNA capture from single photoactivated cells.
PMCID: 4053714
PMID: 25000927
ISSN: 1474-760X
CID: 2049902
BLIND ordering of large-scale transcriptomic developmental timecourses
Anavy, Leon; Levin, Michal; Khair, Sally; Nakanishi, Nagayasu; Fernandez-Valverde, Selene L; Degnan, Bernard M; Yanai, Itai
RNA-Seq enables the efficient transcriptome sequencing of many samples from small amounts of material, but the analysis of these data remains challenging. In particular, in developmental studies, RNA-Seq is challenged by the morphological staging of samples, such as embryos, since these often lack clear markers at any particular stage. In such cases, the automatic identification of the stage of a sample would enable previously infeasible experimental designs. Here we present the 'basic linear index determination of transcriptomes' (BLIND) method for ordering samples comprising different developmental stages. The method is an implementation of a traveling salesman algorithm to order the transcriptomes according to their inter-relationships as defined by principal components analysis. To establish the direction of the ordered samples, we show that an appropriate indicator is the entropy of transcriptomic gene expression levels, which increases over developmental time. Using BLIND, we correctly recover the annotated order of previously published embryonic transcriptomic timecourses for frog, mosquito, fly and zebrafish. We further demonstrate the efficacy of BLIND by collecting 59 embryos of the sponge Amphimedon queenslandica and ordering their transcriptomes according to developmental stage. BLIND is thus useful in establishing the temporal order of samples within large datasets and is of particular relevance to the study of organisms with asynchronous development and when morphological staging is difficult.
PMID: 24504336
ISSN: 1477-9129
CID: 2049912
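The three steps named in the BLIND abstract (principal components analysis, a traveling-salesman ordering, and an entropy-based direction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the input is assumed to be a non-negative, already-normalized samples-by-genes matrix, and a simple nearest-neighbour heuristic stands in for whatever traveling salesman solver the published method uses.

```python
import numpy as np

def pca_scores(expr, n_components=2):
    """Project samples (rows) onto the top principal components via SVD."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def path_length(points, order):
    """Total Euclidean length of the open path visiting points in order."""
    diffs = np.diff(points[order], axis=0)
    return float(np.linalg.norm(diffs, axis=1).sum())

def greedy_path(points):
    """Nearest-neighbour heuristic for an open traveling-salesman path.

    Tries every sample as a starting point and keeps the shortest path.
    This is a heuristic stand-in, not an exact solver.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    best_length, best_order = None, None
    for start in range(n):
        order = [start]
        unvisited = set(range(n)) - {start}
        while unvisited:
            last = order[-1]
            nxt = min(unvisited, key=lambda j: dist[last, j])
            order.append(nxt)
            unvisited.remove(nxt)
        length = path_length(points, order)
        if best_length is None or length < best_length:
            best_length, best_order = length, order
    return best_order

def shannon_entropy(profile):
    """Shannon entropy (bits) of one sample's expression distribution."""
    p = profile / profile.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def blind_like_order(expr):
    """Order samples along the PCA path, oriented so that entropy rises."""
    order = greedy_path(pca_scores(expr))
    if shannon_entropy(expr[order[0]]) > shannon_entropy(expr[order[-1]]):
        order = order[::-1]
    return order
```

Calling `blind_like_order(expr)` returns a list of row indices from the putatively earliest to the latest sample. The orientation step only compares the two ends of the path, relying on the abstract's statement that entropy increases over developmental time.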
The genomic determinants of genotype x environment interactions in gene expression
Grishkevich, Vladislav; Yanai, Itai
Predicting phenotype from genotype is greatly complicated by the polygenic nature of most traits and by the complex interactions between phenotype and the environment. Here, we review recent whole-genome approaches to understand the underlying principles, mechanisms, and evolutionary impacts of genotype x environment (GxE) interactions, defined as genotype-specific phenotypic responses to different environments. There is accumulating evidence that GxE interactions are ubiquitous, accounting perhaps for the greater part of the phenotypic variation seen across genotypes. Such interactions appear to be the consequence of changes to upstream regulators as opposed to local changes to promoters. Moreover, genes are not equally likely to exhibit GxE interactions; promoter architecture, expression level, regulatory complexity, and essentiality correlate with the differential regulation of a gene by the environment. One implication of this correlation is that expression variation across genotypes alone could be used as a proxy for GxE interactions in those experimental cases where identifying environmental variation is costly or impossible.
PMID: 23769209
ISSN: 0168-9525
CID: 2049922
ELOPER: elongation of paired-end reads as a pre-processing tool for improved de novo genome assembly
Silver, David H; Ben-Elazar, Shay; Bogoslavsky, Alexei; Yanai, Itai
MOTIVATION: Paired-end sequencing resulting in gapped short reads is commonly used for de novo genome assembly. Assembly methods use paired-end sequences in a two-step process, first treating each read-end independently, only later invoking the pairing to join the contiguous assemblies (contigs) into gapped scaffolds. Here, we present ELOPER, a pre-processing tool for paired-end sequences that produces a better read library for assembly programs. RESULTS: ELOPER proceeds by simultaneously considering both ends of paired reads, generating elongated reads. We show that ELOPER theoretically doubles read-lengths while halving the number of reads. We provide evidence that pre-processing read libraries using ELOPER leads to considerably improved assemblies as predicted from the Lander-Waterman model. AVAILABILITY: http://sourceforge.net/projects/eloper SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
PMID: 23603334
ISSN: 1367-4811
CID: 2049932
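The ELOPER abstract's appeal to the Lander-Waterman model can be made concrete with a small calculation. The sketch below uses the standard Lander-Waterman expectation for the number of contigs, N * exp(-c * sigma), with coverage c = N * L / G and sigma = 1 - T / L. The function name, the example genome size, and the minimum overlap T are illustrative assumptions, and the sketch does not model the gap between read ends or anything specific to ELOPER itself.

```python
import math

def expected_contigs(n_reads, read_len, genome_len, min_overlap):
    """Expected number of contigs under the Lander-Waterman model.

    n_reads     -- number of reads (N)
    read_len    -- read length in bases (L)
    genome_len  -- genome length in bases (G)
    min_overlap -- minimum detectable overlap in bases (T), assumed < L
    """
    coverage = n_reads * read_len / genome_len   # c = N * L / G
    sigma = 1.0 - min_overlap / read_len         # sigma = 1 - T / L
    return n_reads * math.exp(-coverage * sigma)

# Hypothetical library: doubling L while halving N keeps coverage fixed,
# but both the prefactor N and the exponent change in favour of fewer contigs.
before = expected_contigs(1_000_000, 100, 10_000_000, 30)
after = expected_contigs(500_000, 200, 10_000_000, 30)
```

Under these assumed numbers, coverage is identical in both cases, so any difference between `before` and `after` comes only from the longer reads and the smaller read count.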
Spatial localization of co-regulated genes exceeds genomic gene clustering in the Saccharomyces cerevisiae genome
Ben-Elazar, Shay; Yakhini, Zohar; Yanai, Itai
While it has been long recognized that genes are not randomly positioned along the genome, the degree to which its 3D structure influences the arrangement of genes has remained elusive. In particular, several lines of evidence suggest that actively transcribed genes are spatially co-localized, forming transcription factories; however, a generalized systematic test has hitherto not been described. Here we reveal transcription factories using a rigorous definition of genomic structure based on Saccharomyces cerevisiae chromosome conformation capture data, coupled with an experimental design controlling for the primary gene order. We develop a data-driven method for the interpolation and the embedding of such datasets and introduce statistics that enable the comparison of the spatial and genomic densities of genes. Combining these, we report evidence that co-regulated genes are clustered in space, beyond their observed clustering in the context of gene order along the genome, and show that this phenomenon is significant for 64 out of 117 transcription factors. Furthermore, we show that those transcription factors with highly spatially co-localized targets are more highly expressed than those whose targets are not spatially clustered. Collectively, our results support the notion that, at a given time, the physical density of genes is intimately related to regulatory activity.
PMCID: 3575811
PMID: 23303780
ISSN: 1362-4962
CID: 2369442
An introduction to high-throughput sequencing experiments: design and bioinformatics analysis
Normand, Rachelly; Yanai, Itai
The dramatic fall in the cost of DNA sequencing has revolutionized the experiments within reach in the life sciences. Here we provide an introduction to the domains of analysis possible using high-throughput sequencing, distinguishing between "counting" and "reading" applications. We discuss the steps in designing a high-throughput sequencing experiment, introduce the most widely used applications, and describe basic sequencing concepts. We review the various software programs available for many of the bioinformatics analyses required to make sense of the sequencing data. We hope that this introduction will be accessible to biologists with no previous background in bioinformatics, yet with a keen interest in applying the power of high-throughput sequencing in their research.
PMID: 23872966
ISSN: 1940-6029
CID: 2049952
Identifying functional links between genes by evolutionary transcriptomics
Silver, David H; Levin, Michal; Yanai, Itai
The ability to determine gene expression profiles across distant species presents a unique opportunity to identify functional relationships between genes. In particular, transcriptome data may help to distinguish whether genes with similar expression profiles are functionally related or independent. Recent studies on the evolution of gene expression have revealed a striking amount of divergence across strains and species, a notion which has hitherto not been brought to bear on the problem of detecting functional relationships between genes. Here, we introduce evo-links, a method by which a pair of genes is linked if their expression profiles are consistently more similar within species, while their individual conservation across species is low. We show that genes connected through evo-links are more enriched in known functional interactions than genes linked by conventional correlation measures. The network of linked genes further allows the identification of gene communities which reflect distinct functional pathways. We classified communities into major cell-types and derived a temporal developmental map of tissue specification in the nematode C. elegans. This map shows the sequential activation of the endoderm, body wall muscle, and neuronal tissues, and later the pharynx. We propose that as comparative transcriptomics becomes increasingly feasible, evo-links offer a robust method to detect functional relationships and disentangle developmental pathways in data lacking spatial resolution.
PMID: 22772133
ISSN: 1742-2051
CID: 2049962
CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification
Hashimshony, Tamar; Wagner, Florian; Sher, Noa; Yanai, Itai
High-throughput sequencing has allowed for unprecedented detail in gene expression analyses, yet its efficient application to single cells is challenged by the small starting amounts of RNA. We have developed CEL-Seq, a method for overcoming this limitation by barcoding and pooling samples before linearly amplifying mRNA with the use of one round of in vitro transcription. We show that CEL-Seq gives more reproducible, linear, and sensitive results than a PCR-based amplification method. We demonstrate the power of this method by studying early C. elegans embryonic development at single-cell resolution. Differential distribution of transcripts between sister cells is seen as early as the two-cell stage embryo, and zygotic expression in the somatic cell lineages is enriched for transcription factors. The robust transcriptome quantifications enabled by CEL-Seq will be useful for transcriptomic analyses of complex tissues containing populations of diverse cell types.
PMID: 22939981
ISSN: 2211-1247
CID: 2049972