Structure of a polymorphic repeat at the CACNA1C schizophrenia locus
Moya, Raquel; Wang, Xiaohan; Tsien, Richard W; Maurano, Matthew T
Genetic variation within intron 3 of the CACNA1C calcium channel gene is associated with schizophrenia and other neuropsychiatric disorders, but analysis of the causal variants and their effect is complicated by a nearby variable-number tandem repeat (VNTR). Here, we explored the structure and population variability of the CACNA1C intron 3 VNTR using 155 long-read genome assemblies from 78 diverse individuals. Based on sequence differences among repeat units, we clustered individual sequences into 7 VNTR structural alleles called Types. Three Types were related through large duplications, but the other Types diverged much earlier such that only 12 repeat units at the 5' end of the VNTR were shared across most Types. The most diverged Types were rare and present only in individuals with African ancestry, but a multiallelic structural polymorphism was present across populations at different frequencies, consistent with expansion of the VNTR preceding the emergence of early hominins. We demonstrated that this polymorphism was in complete linkage disequilibrium with fine-mapped schizophrenia variants from genome-wide association studies (GWAS) and that this risk haplotype was associated with decreased CACNA1C gene expression in the brain. Our work suggests that sequence variation within a human-specific VNTR affects gene expression and provides a detailed characterization of new alleles at a flagship neuropsychiatric GWAS locus.
PMID: 40932769
ISSN: 1091-6490
CID: 5934622
Genomic context sensitizes regulatory elements to genetic disruption
Ordoñez, Raquel; Zhang, Weimin; Ellis, Gwen; Zhu, Yinan; Ashe, Hannah J; Ribeiro-Dos-Santos, André M; Brosh, Ran; Huang, Emily; Hogan, Megan S; Boeke, Jef D; Maurano, Matthew T
Genomic context critically modulates regulatory function but is difficult to manipulate systematically. The murine insulin-like growth factor 2 (Igf2)/H19 locus is a paradigmatic model of enhancer selectivity, whereby CTCF occupancy at an imprinting control region directs downstream enhancers to activate either H19 or Igf2. We used synthetic regulatory genomics to repeatedly replace the native locus with 157-kb payloads, and we systematically dissected its architecture. Enhancer deletion and ectopic delivery revealed previously uncharacterized long-range regulatory dependencies at the native locus. Exchanging the H19 enhancer cluster with the Sox2 locus control region (LCR) showed that the H19 enhancers relied on their native surroundings while the Sox2 LCR functioned autonomously. Analysis of regulatory DNA actuation across cell types revealed that these enhancer clusters typify broader classes of context sensitivity genome wide. These results show that unexpected dependencies influence even well-studied loci, and our approach permits large-scale manipulation of complete loci to investigate the relationship between regulatory architecture and function.
PMID: 38759624
ISSN: 1097-4164
CID: 5658782
Synthetic regulatory genomics uncovers enhancer context dependence at the Sox2 locus
Brosh, Ran; Coelho, Camila; Ribeiro-Dos-Santos, André M; Ellis, Gwen; Hogan, Megan S; Ashe, Hannah J; Somogyi, Nicolette; Ordoñez, Raquel; Luther, Raven D; Huang, Emily; Boeke, Jef D; Maurano, Matthew T
Sox2 expression in mouse embryonic stem cells (mESCs) depends on a distal cluster of DNase I hypersensitive sites (DHSs), but their individual contributions and degree of interdependence remain a mystery. We analyzed the endogenous Sox2 locus using Big-IN to scarlessly integrate large DNA payloads incorporating deletions, rearrangements, and inversions affecting single or multiple DHSs, as well as surgical alterations to transcription factor (TF) recognition sequences. Multiple mESC clones were derived for each payload, sequence-verified, and analyzed for Sox2 expression. We found that two DHSs comprising a handful of key TF recognition sequences were each sufficient for long-range activation of Sox2 expression. By contrast, three nearby DHSs were entirely context dependent, showing no activity alone but dramatically augmenting the activity of the autonomous DHSs. Our results highlight the role of context in modulating genomic regulatory element function, and our synthetic regulatory genomics approach provides a roadmap for the dissection of other genomic loci.
PMCID: 10081970
PMID: 36931273
ISSN: 1097-4164
CID: 5462642
Genomic context sensitivity of insulator function
Ribeiro-Dos-Santos, André M; Hogan, Megan S; Luther, Raven D; Brosh, Ran; Maurano, Matthew T
The specificity of interactions between genomic regulatory elements and potential target genes is influenced by the binding of insulator proteins such as CTCF, which can act as potent enhancer blockers when interposed between an enhancer and a promoter in a reporter assay. But not all CTCF sites genome-wide function as insulator elements, depending on cellular and genomic context. To dissect the influence of genomic context on enhancer blocker activity, we integrated reporter constructs with promoter-only, promoter and enhancer, and enhancer blocker configurations at hundreds of thousands of genomic sites using the Sleeping Beauty transposase. Deconvolution of reporter activity by genomic position reveals distinct expression patterns subject to genomic context, including a compartment of enhancer blocker reporter integrations with robust expression. The high density of integration sites permits quantitative delineation of characteristic genomic context sensitivity profiles, and their decomposition into sensitivity to both local and distant DNase I hypersensitive sites. Furthermore, using a single-cell expression approach to test the effect of integrated reporters for differential expression of nearby endogenous genes reveals that CTCF insulator elements do not completely abrogate reporter effects on endogenous gene expression. Collectively, our results lend new insight to genomic regulatory compartmentalization and its influence on the determinants of promoter-enhancer specificity.
PMID: 35082140
ISSN: 1549-5469
CID: 5154592
An effector index to predict target genes at GWAS loci
Forgetta, Vincenzo; Jiang, Lai; Vulpescu, Nicholas A; Hogan, Megan S; Chen, Siyuan; Morris, John A; Grinek, Stepan; Benner, Christian; Jang, Dong-Keun; Hoang, Quy; Burtt, Noel; Flannick, Jason A; McCarthy, Mark I; Fauman, Eric; Greenwood, Celia M T; Maurano, Matthew T; Richards, J Brent
Drug development and biological discovery require effective strategies to map existing genetic associations to causal genes. To approach this problem, we selected 12 common diseases and quantitative traits for which highly powered genome-wide association studies (GWAS) were available. For each disease or trait, we systematically curated positive control gene sets from Mendelian forms of the disease and from targets of medicines used for disease treatment. We found that these positive control genes were highly enriched in proximity of GWAS-associated single-nucleotide variants (SNVs). We then performed quantitative assessment of the contribution of commonly used genomic features, including open chromatin maps, expression quantitative trait loci (eQTL), and chromatin conformation data. Using these features, we trained and validated an Effector Index (Ei), to map target genes for these 12 common diseases and traits. Ei demonstrated high predictive performance, both with cross-validation on the training set, and an independently derived set for type 2 diabetes. Key predictive features included coding or transcript-altering SNVs, distance to gene, and open chromatin-based metrics. This work outlines a simple, understandable approach to prioritize genes at GWAS loci for functional follow-up and drug development, and provides a systematic strategy for prioritization of GWAS target genes.
PMID: 35147782
ISSN: 1432-1203
CID: 5156932
Sequencing identifies multiple early introductions of SARS-CoV-2 to the New York City Region
Maurano, Matthew T; Ramaswami, Sitharam; Zappile, Paul; Dimartino, Dacia; Boytard, Ludovic; Ribeiro-Dos-Santos, André M; Vulpescu, Nicholas A; Westby, Gael; Shen, Guomiao; Feng, Xiaojun; Hogan, Megan S; Ragonnet-Cronin, Manon; Geidelberg, Lily; Marier, Christian; Meyn, Peter; Zhang, Yutong; Cadley, John A; Ordoñez, Raquel; Luther, Raven; Huang, Emily; Guzman, Emily; Arguelles-Grande, Carolina; Argyropoulos, Kimon V; Black, Margaret; Serrano, Antonio; Call, Melissa E; Kim, Min Jae; Belovarac, Brendan; Gindin, Tatyana; Lytle, Andrew; Pinnell, Jared; Vougiouklakis, Theodore; Chen, John; Lin, Lawrence H; Rapkiewicz, Amy; Raabe, Vanessa; Samanovic, Marie I; Jour, George; Osman, Iman; Aguero-Rosenfeld, Maria; Mulligan, Mark J; Volz, Erik M; Cotzia, Paolo; Snuderl, Matija; Heguy, Adriana
Effective public response to a pandemic relies upon accurate measurement of the extent and dynamics of an outbreak. Viral genome sequencing has emerged as a powerful approach to link seemingly unrelated cases, and large-scale sequencing surveillance can inform on critical epidemiological parameters. Here, we report the analysis of 864 SARS-CoV-2 sequences from cases in the New York City metropolitan area during the COVID-19 outbreak in Spring 2020. The majority of cases had no recent travel history or known exposure, and genetically linked cases were spread throughout the region. Comparison to global viral sequences showed that early transmission was most linked to cases from Europe. Our data are consistent with numerous seeds from multiple sources and a prolonged period of unrecognized community spreading. This work highlights the complementary role of genomic surveillance in addition to traditional epidemiological indicators.
PMID: 33093069
ISSN: 1549-5469
CID: 4642522
Systematic localization of common disease-associated variation in regulatory DNA
Maurano, Matthew T; Humbert, Richard; Rynes, Eric; Thurman, Robert E; Haugen, Eric; Wang, Hao; Reynolds, Alex P; Sandstrom, Richard; Qu, Hongzhu; Brody, Jennifer; Shafer, Anthony; Neri, Fidencio; Lee, Kristen; Kutyavin, Tanya; Stehling-Sun, Sandra; Johnson, Audra K; Canfield, Theresa K; Giste, Erika; Diegel, Morgan; Bates, Daniel; Hansen, R Scott; Neph, Shane; Sabo, Peter J; Heimfeld, Shelly; Raubitschek, Antony; Ziegler, Steven; Cotsapas, Chris; Sotoodehnia, Nona; Glass, Ian; Sunyaev, Shamil R; Kaul, Rajinder; Stamatoyannopoulos, John A
Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure-related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.
PMCID: 3771521
PMID: 22955828
ISSN: 0036-8075
CID: 1354152
Mammalian genome writing: Unlocking new length scales for genome engineering
Pinglay, Sudarshan; Atwater, John T; Brosh, Ran; Shendure, Jay; Maurano, Matthew T; Boeke, Jef D
The ability to design and engineer mammalian genomes across arbitrary length scales would transform biology and medicine. Such capabilities would enable the systematic dissection of mechanisms governing gene regulation and the influence of complex haplotypes on human traits and disease. They would also facilitate the engineering of disease models that more faithfully recapitulate human physiology and of next-generation cell therapies harboring sophisticated genetic circuits. Over the past decade, advances in genome editing have made small, targeted modifications at single sites routine. However, achieving multiple coordinated alterations across long sequence windows (>10 kb) or installing large synthetic DNA segments in mammalian cells remains a major challenge. Recent advances in mammalian genome writing (the bottom-up design, assembly, and targeted integration of large custom DNA sequences, independent of any natural template) offer a potential solution. Here, we review key technological developments, highlight emerging applications, and discuss current bottlenecks and strategies for overcoming them.
PMID: 41576918
ISSN: 1097-4172
CID: 5988842
Iterative improvement of deep learning models using synthetic regulatory genomics
Ribeiro-Dos-Santos, André M; Maurano, Matthew T
Deep learning models can accurately reconstruct genome-wide epigenetic tracks from the reference genome sequence alone. But it is unclear what predictive power they have on sequence diverging from the reference, such as disease- and trait-associated variants or engineered sequences. Recent work has applied synthetic regulatory genomics to characterize dozens of deletions, inversions, and rearrangements of DNase I hypersensitive sites (DHSs). Here, we use the state-of-the-art model Enformer to predict DNA accessibility and RNA transcription across these engineered sequences when delivered at their endogenous loci. At a high level, we observe a good correlation between accessibility predicted by Enformer and experimental data. But model performance is best for sequences that more closely resemble the reference, such as single deletions or combinations of multiple DHSs. Predictive power is poorer for rearrangements affecting DHS order or orientation. We use these data to fine-tune Enformer, yielding significant reduction in prediction error. We show that this fine-tuning retains strong predictive performance for other tracks. Our results show that current deep learning models perform poorly when presented with novel sequences diverging in certain critical features from their training set. Thus, an iterative approach incorporating profiling of synthetic constructs can improve model generalizability and ultimately enable functional classification of regulatory variants identified by population studies.
PMID: 41125441
ISSN: 1549-5469
CID: 5956992
Genome writing to dissect consequences of SVA retrotransposon disease X-Linked Dystonia Parkinsonism
Zhang, Weimin; Zhao, Yu; Prakash, Priya; Appleby, Heather L; Barriball, Kelly; Capponi, Simona; Jiang, Qingwen; Wudzinska, Aleksandra M; Vaine, Christine A; Ellis, Gwen; Rahman, Neha; Markovic, Stefan; Mishkit, Orin; Limberg, Kerry C; Maurano, Matthew T; Wadghiri, Youssef Z; Kim, Sang Yong; Timmers, H T Marc; Bragg, D Cristopher; Liddelow, Shane A; Brosh, Ran; Boeke, Jef D
Human retrotransposon insertions are often associated with diseases. In the case of the neurodegenerative X-Linked Dystonia-Parkinsonism disease, a human-specific SINE-VNTR-Alu subfamily F retrotransposon was inserted in intron 32 of the TAF1 gene. Here, we genomically rewrote a portion of the mouse Taf1 allele with the corresponding 78-kb XDP patient-derived TAF1 allele. In mESCs, the presence of the intronic SVAs, rather than the hybrid gene structure, reduces hyTAF1 levels. This leads to transcriptional downregulation of genes with TATA box enriched in their promoters and triggers apoptosis. Chromatin and transcriptome profiling revealed that intronic SVAs are actively transcribed, forming barriers that likely impede transcription elongation. In mice, neuronal lineage TAF1 humanization resulted in lethality of male progeny within two months. XDP male mice had severe atrophy centered on the striatum, the same brain region affected in XDP patients. Lastly, CRISPRa-mediated activation of hyTAF1 restored mESC viability, suggesting boosting TAF1 transcription as a therapeutic approach.
PMCID: 12632633
PMID: 41279153
ISSN: 2692-8205
CID: 5967852