BEDOPS: high-performance genomic feature operations
Neph, Shane; Kuehn, M Scott; Reynolds, Alex P; Haugen, Eric; Thurman, Robert E; Johnson, Audra K; Rynes, Eric; Maurano, Matthew T; Vierstra, Jeff; Thomas, Sean; Sandstrom, Richard; Humbert, Richard; Stamatoyannopoulos, John A
The large and growing number of genome-wide datasets highlights the need for high-performance feature analysis and data comparison methods, in addition to efficient data storage and retrieval techniques. We introduce BEDOPS, a software suite for common genomic analysis tasks which offers improved flexibility, scalability and execution time characteristics over previously published packages. The suite includes a utility to compress large inputs into a lossless format that can provide greater space savings and faster data extractions than alternatives. AVAILABILITY: http://code.google.com/p/bedops/ includes binaries, source and documentation.
PMCID:3389768
PMID: 22576172
ISSN: 1367-4803
CID: 1354202
Widespread plasticity in CTCF occupancy linked to DNA methylation
Wang, Hao; Maurano, Matthew T; Qu, Hongzhu; Varley, Katherine E; Gertz, Jason; Pauli, Florencia; Lee, Kristen; Canfield, Theresa; Weaver, Molly; Sandstrom, Richard; Thurman, Robert E; Kaul, Rajinder; Myers, Richard M; Stamatoyannopoulos, John A
CTCF is a ubiquitously expressed regulator of fundamental genomic processes including transcription, intra- and interchromosomal interactions, and chromatin structure. Because of its critical role in genome function, CTCF binding patterns have long been assumed to be largely invariant across different cellular environments. Here we analyze genome-wide occupancy patterns of CTCF by ChIP-seq in 19 diverse human cell types, including normal primary cells and immortal lines. We observed highly reproducible yet surprisingly plastic genomic binding landscapes, indicative of strong cell-selective regulation of CTCF occupancy. Comparison with massively parallel bisulfite sequencing data indicates that 41% of variable CTCF binding is linked to differential DNA methylation, concentrated at two critical positions within the CTCF recognition sequence. Unexpectedly, CTCF binding patterns were markedly different in normal versus immortal cells, with the latter showing widespread disruption of CTCF binding associated with increased methylation. Strikingly, this disruption is accompanied by up-regulation of CTCF expression, with the result that both normal and immortal cells maintain the same average number of CTCF occupancy sites genome-wide. These results reveal a tight linkage between DNA methylation and the global occupancy patterns of a major sequence-specific regulatory factor.
PMCID:3431485
PMID: 22955980
ISSN: 1088-9051
CID: 1354192
The accessible chromatin landscape of the human genome
Thurman, Robert E; Rynes, Eric; Humbert, Richard; Vierstra, Jeff; Maurano, Matthew T; Haugen, Eric; Sheffield, Nathan C; Stergachis, Andrew B; Wang, Hao; Vernot, Benjamin; Garg, Kavita; John, Sam; Sandstrom, Richard; Bates, Daniel; Boatman, Lisa; Canfield, Theresa K; Diegel, Morgan; Dunn, Douglas; Ebersol, Abigail K; Frum, Tristan; Giste, Erika; Johnson, Audra K; Johnson, Ericka M; Kutyavin, Tanya; Lajoie, Bryan; Lee, Bum-Kyu; Lee, Kristen; London, Darin; Lotakis, Dimitra; Neph, Shane; Neri, Fidencio; Nguyen, Eric D; Qu, Hongzhu; Reynolds, Alex P; Roach, Vaughn; Safi, Alexias; Sanchez, Minerva E; Sanyal, Amartya; Shafer, Anthony; Simon, Jeremy M; Song, Lingyun; Vong, Shinny; Weaver, Molly; Yan, Yongqi; Zhang, Zhancheng; Zhang, Zhuzhu; Lenhard, Boris; Tewari, Muneesh; Dorschner, Michael O; Hansen, R Scott; Navas, Patrick A; Stamatoyannopoulos, George; Iyer, Vishwanath R; Lieb, Jason D; Sunyaev, Shamil R; Akey, Joshua M; Sabo, Peter J; Kaul, Rajinder; Furey, Terrence S; Dekker, Job; Crawford, Gregory E; Stamatoyannopoulos, John A
DNase I hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify approximately 2.9 million DHSs that encompass virtually all known experimentally validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns. We connect approximately 580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is organized with dozens to hundreds of co-activated elements, and the transcellular DNase I sensitivity pattern at a given region can predict cell-type-specific functional behaviours. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.
PMCID:3721348
PMID: 22955617
ISSN: 0028-0836
CID: 1354172
An expansive human regulatory lexicon encoded in transcription factor footprints
Neph, Shane; Vierstra, Jeff; Stergachis, Andrew B; Reynolds, Alex P; Haugen, Eric; Vernot, Benjamin; Thurman, Robert E; John, Sam; Sandstrom, Richard; Johnson, Audra K; Maurano, Matthew T; Humbert, Richard; Rynes, Eric; Wang, Hao; Vong, Shinny; Lee, Kristen; Bates, Daniel; Diegel, Morgan; Roach, Vaughn; Dunn, Douglas; Neri, Jun; Schafer, Anthony; Hansen, R Scott; Kutyavin, Tanya; Giste, Erika; Weaver, Molly; Canfield, Theresa; Sabo, Peter; Zhang, Miaohua; Balasundaram, Gayathri; Byron, Rachel; MacCoss, Michael J; Akey, Joshua M; Bender, M A; Groudine, Mark; Kaul, Rajinder; Stamatoyannopoulos, John A
Regulatory factor binding to genomic DNA protects the underlying sequence from cleavage by DNase I, leaving nucleotide-resolution footprints. Using genomic DNase I footprinting across 41 diverse cell and tissue types, we detected 45 million transcription factor occupancy events within regulatory regions, representing differential binding to 8.4 million distinct short sequence elements. Here we show that this small genomic sequence compartment, roughly twice the size of the exome, encodes an expansive repertoire of conserved recognition sequences for DNA-binding proteins that nearly doubles the size of the human cis-regulatory lexicon. We find that genetic variants affecting allelic chromatin states are concentrated in footprints, and that these elements are preferentially sheltered from DNA methylation. High-resolution DNase I cleavage patterns mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces, indicating that transcription factor structure has been evolutionarily imprinted on the human genome sequence. We identify a stereotyped 50-base-pair footprint that precisely defines the site of transcript origination within thousands of human promoters. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency.
PMCID:3736582
PMID: 22955618
ISSN: 0028-0836
CID: 1354162