Indirect effect inference and application to GAW20 data
Li, Liming; Wang, Chan; Lu, Tianyuan; Lin, Shili; Hu, Yue-Qing
BACKGROUND: Association studies using a single type of omics data have been successful in identifying disease-associated genetic markers, but they leave the underlying mechanisms unaddressed. To provide a possible explanation of how these genetic factors affect the disease phenotype, integration of multiple omics data is needed. RESULTS: We propose a novel method, LIPID (likelihood inference proposal for indirect estimation), that uses single nucleotide polymorphism (SNP) and DNA methylation data jointly to analyze the association between a trait and SNPs. The total effect of SNPs is decomposed into direct and indirect effects, and the indirect effects are the focus of our investigation. Simulation studies show that LIPID performs better than existing methods in various scenarios. Application to the GAW20 data also leads to encouraging results, as the genes identified appear to be biologically relevant to the phenotype studied. CONCLUSIONS: The proposed LIPID method performs well in extensive simulations and in real-data analyses.
PMCID:6157197
PMID: 30255768
ISSN: 1471-2156
CID: 5686552
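The decomposition of a total effect into direct and indirect parts mentioned in the abstract above can be illustrated with the standard linear mediation model. This is a generic textbook sketch, not the LIPID likelihood itself. The symbols are assumptions introduced here for illustration: X is a SNP genotype, M is a methylation level acting as mediator, Y is the trait, and a, b, c' are regression coefficients.

```latex
\begin{align}
M &= \alpha_0 + a\,X + \varepsilon_M, \\
Y &= \beta_0 + c'\,X + b\,M + \varepsilon_Y, \\
\underbrace{c}_{\text{total effect}} &= \underbrace{c'}_{\text{direct effect}} + \underbrace{a\,b}_{\text{indirect effect}}.
\end{align}
```

Substituting the first equation into the second gives Y = (\beta_0 + b\,\alpha_0) + (c' + a\,b)\,X + noise, which is where the sum in the last line comes from under these linearity assumptions.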
Detecting multiple variants associated with disease based on sequencing data of case-parent trios
Wang, Chan; Sun, Leiming; Zheng, Haitao; Hu, Yue-Qing
With the advance of next-generation sequencing technology, rare variants join common ones in explaining a larger proportion of heritability. The coexistence of common with rare, causal with neutral, and deleterious with protective variants is the norm and should be appropriately addressed. Some existing methods suffer from low power when one or more forms of coexistence are present, impeding their application in practice. In this paper, for case-parent trios, pseudocontrols are constructed using the nontransmitted alleles of the parents. The Kullback-Leibler divergence is used to measure the difference between the distributions of variants in a genetic region for the affected children and the pseudocontrols, and two nonparametric test statistics, KLTT and cKLTT, are proposed. Extensive simulations show that they are robust to opposite directions of effect among the causal variants and to the number of neutral variants, and that they outperform existing methods when both rare and common variants are involved. Furthermore, their efficiency is demonstrated in an application to data from the Framingham Heart Study.
PMID: 27278787
ISSN: 1435-232X
CID: 5686532
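The idea in the abstract above of comparing variant distributions between affected children and pseudocontrols with the Kullback-Leibler divergence can be sketched as follows. This is a minimal illustration and not the KLTT or cKLTT statistic: the allele counts are made up, and the helper names `normalize` and `kl_divergence` as well as the pseudocount smoothing are assumptions introduced here.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def normalize(counts, pseudocount=0.5):
    """Turn per-variant allele counts into a distribution.

    The pseudocount (an assumption of this sketch) avoids zero
    probabilities, which would make the divergence undefined.
    """
    smoothed = [c + pseudocount for c in counts]
    total = sum(smoothed)
    return [s / total for s in smoothed]

# Hypothetical minor-allele counts at five variants in one genetic region.
children = [12, 3, 0, 7, 1]        # alleles transmitted to affected children
pseudocontrols = [5, 4, 1, 6, 2]   # alleles the parents did not transmit

p = normalize(children)
q = normalize(pseudocontrols)
divergence = kl_divergence(p, q)
print(f"KL(children || pseudocontrols) = {divergence:.4f}")
```

A divergence near zero means the two groups carry the variants in similar proportions. Turning such a value into a test would additionally need a null distribution, for example by permutation, which this sketch omits.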
Utilizing mutual information for detecting rare and common variants associated with a categorical trait
Sun, Leiming; Wang, Chan; Hu, Yue-Qing
Background. Genome-wide association studies have succeeded in detecting novel common variants associated with complex diseases. With the rapid development of next-generation sequencing technology, a large amount of sequencing data is being generated, which offers great opportunities to identify rare variants that could explain a larger proportion of the missing heritability. Many effective and powerful methods have been proposed, although they are usually limited to continuous, dichotomous, or ordinal traits. Traits with nominal categorical features are nonetheless commonly observed in complex diseases, especially in mental disorders, which motivates incorporating the characteristics of a categorical trait into association studies with rare and common variants. Methods. We construct two simple and intuitive nonparametric tests, MIT and aMIT, based on mutual information for detecting association between genetic variants in a gene or region and a categorical trait. MIT and aMIT gauge the difference among the distributions of rare and common variants across a region given each value of the categorical trait. If there is little association between the variants and the categorical trait, MIT or aMIT is approximately zero. The larger the difference in distributions, the larger the values of MIT and aMIT. Therefore, MIT and aMIT have the potential to detect functional variants. Results. We checked the validity of the proposed statistics and compared them to existing ones through extensive simulation studies with varied combinations of the numbers of rare causal, rare non-causal, common causal, and common non-causal variants, deleterious and protective effects, various minor allele frequencies, and different levels of linkage disequilibrium. The results show that our methods have higher statistical power than conventional ones, including the likelihood-based score test, in most cases: (1) there are multiple genetic variants in a gene or region; (2) both protective and deleterious variants are present; (3) both rare and common variants exist; and (4) more than half of the variants are neutral. The proposed tests are applied to the data from the Collaborative Studies on Genetics of Alcoholism, where they perform competently. Discussion. As a complement to existing methods, which mainly focus on quantitative traits, this study provides the nonparametric tests MIT and aMIT for detecting variants associated with a categorical trait. Furthermore, we plan to investigate the association between rare variants and multiple categorical traits.
PMCID:4918222
PMID: 27350900
ISSN: 2167-8359
CID: 5686542
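The property stated in the abstract above, that a mutual-information statistic is near zero without association and grows as the distributions differ, can be seen in a small sketch. This computes plain empirical mutual information between one variant and a categorical trait. It is not the MIT or aMIT statistic, and the function name `mutual_information` and the toy data are assumptions introduced here.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two paired sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (genotype, trait) pair
    px = Counter(xs)               # marginal genotype counts
    py = Counter(ys)               # marginal trait counts
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (px[x] * py[y])
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi

# Hypothetical data: minor-allele carrier status and a nominal trait.
genotypes = [0, 0, 1, 1]
trait_unrelated = ["A", "B", "A", "B"]   # trait independent of genotype
trait_related = ["A", "A", "B", "B"]     # trait determined by genotype

mi_unrelated = mutual_information(genotypes, trait_unrelated)
mi_related = mutual_information(genotypes, trait_related)
print(mi_unrelated, mi_related)
```

In the unrelated case every joint frequency equals the product of its marginals, so the statistic is exactly zero. In the related case it equals log 2, the entropy of the two-level trait.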