Searched for:

person:huj08

in-biosketch:yes

Total Results:

35


A maximum-type microbial differential abundance test with application to high-dimensional microbiome data analyses

Li, Zhengbang; Yu, Xiaochen; Guo, Hongping; Lee, TingFang; Hu, Jiyuan
BACKGROUND:High-throughput metagenomic sequencing technologies have shown prominent advantages over traditional pathogen detection methods, bringing great potential in clinical pathogen diagnosis and treatment of infectious diseases. Nevertheless, how to accurately detect the difference in microbiome profiles between treatment or disease conditions remains computationally challenging. RESULTS:In this study, we propose a novel test for identifying the difference between two high-dimensional microbiome abundance data matrices based on the centered log-ratio transformation of the microbiome compositions. The test p-value can be calculated directly with a closed-form solution from the derived asymptotic null distribution. We also investigate the asymptotic statistical power against sparse alternatives that are typically encountered in microbiome studies. The proposed test is maximum-type equal-covariance-assumption-free (MECAF), making it widely applicable to studies that compare microbiome compositions between conditions. Our simulation studies demonstrated that the proposed MECAF test achieves more desirable power than competing methods while having the type I error rate well controlled under various scenarios. The usefulness of the proposed test is further illustrated with two real microbiome data analyses. The source code of the proposed method is freely available at https://github.com/Jiyuan-NYU-Langone/MECAF. CONCLUSIONS:MECAF is a flexible differential abundance test and achieves statistical efficiency in analyzing high-throughput microbiome data. The proposed new method will allow us to efficiently discover shifts in microbiome abundances between disease and treatment conditions, broadening our understanding of the disease and ultimately improving clinical diagnosis and treatment.
PMCID:9650337
PMID: 36389165
ISSN: 2235-2988
CID: 5371642

Joint modeling of zero-inflated longitudinal proportions and time-to-event data with application to a gut microbiome study

Hu, Jiyuan; Wang, Chan; Blaser, Martin J; Li, Huilin
Recent studies have suggested that the temporal dynamics of the human microbiome may have associations with human health and disease. An increasing number of longitudinal microbiome studies, which record time to disease onset, aim to identify candidate microbes as biomarkers for prognosis. Owing to the ultra-skewness and sparsity of microbiome proportion (relative abundance) data, directly applying traditional statistical methods may result in substantial power loss or spurious inferences. We propose a novel joint modeling framework [JointMM], which is comprised of two sub-models: a longitudinal sub-model called zero-inflated scaled-Beta generalized linear mixed-effects regression to depict the temporal structure of microbial proportions among subjects; and a survival sub-model to characterize the occurrence of an event and its relationship with the longitudinal microbiome proportions. JointMM is specifically designed to handle the zero-inflated and highly skewed longitudinal microbial proportion data and examine whether the temporal pattern of microbial presence and/or the non-zero microbial proportions are associated with differences in the time to an event. The longitudinal sub-model of JointMM also provides the capacity to investigate how the (time-varying) covariates are related to the temporal microbial presence/absence patterns and/or the changing trend in non-zero proportions. Comprehensive simulations and real data analyses are used to assess the statistical efficiency and interpretability of JointMM.
PMID: 34213763
ISSN: 1541-0420
CID: 4950332

Reducing Ophthalmic Health Disparities Through Transfer Learning: A Novel Application to Overcome Data Inequality

Lee, TingFang; Wollstein, Gadi; Madu, Chisom T; Wronka, Andrew; Zheng, Lei; Zambrano, Ronald; Schuman, Joel S; Hu, Jiyuan
PURPOSE:Race disparities in the healthcare system and the resulting inequality in clinical data among different races hinder the ability to generate equitable prediction results. This study aims to reduce healthcare disparities arising from data imbalance by leveraging advanced transfer learning (TL) methods. METHOD:We examined the ophthalmic healthcare disparities at a population level using electronic medical records data from a study cohort (N = 785) receiving care at an academic institute. Regression-based TL models were used, transferring valuable information from the dominant racial group (White) to improve visual field mean deviation (MD) rate of change prediction particularly for data-disadvantaged African American (AA) and Asian racial groups. Prediction results of TL models were compared with two conventional approaches. RESULTS:Disparities in socioeconomic status and baseline disease severity were observed among the AA and Asian racial groups. The TL approach achieved marked to comparable improvement in prediction accuracy compared to the two conventional approaches as evident by smaller mean absolute errors or mean square errors. TL identified distinct key features of visual field MD rate of change for each racial group. CONCLUSIONS:The study introduces a novel application of TL that improved reliability of the analysis in comparison with conventional methods, especially in small sample size groups. This can improve assessment of healthcare disparity and subsequent remedy approach. TRANSLATIONAL RELEVANCE:TL offers an equitable and efficient approach to mitigate healthcare disparities analysis by enhancing prediction performance for data-disadvantaged group.
PMCID:10697175
PMID: 38038606
ISSN: 2164-2591
CID: 5589882

LIMBARE: An Advanced Linear Mixed-Effects Breakpoint Analysis With Robust Estimation Method With Applications to Longitudinal Ophthalmic Studies

Lee, TingFang; Schuman, Joel S; Ramos Cadena, Maria de Los Angeles; Zhang, Yan; Wollstein, Gadi; Hu, Jiyuan
PURPOSE:Broken stick analysis is a widely used approach for detecting unknown breakpoints where the association between measurements is nonlinear. We propose LIMBARE, an advanced linear mixed-effects breakpoint analysis with robust estimation, especially designed for longitudinal ophthalmic studies. LIMBARE accommodates repeated measurements from both eyes and over time, and it effectively addresses the presence of outliers. METHODS:The model setup of LIMBARE and the computing algorithm for point and confidence interval estimates of the breakpoint were introduced. The performance of LIMBARE and other competing methods was assessed via comprehensive simulation studies and application to a longitudinal ophthalmic study with 216 eyes (145 subjects) followed for an average of 3.7 ± 1.3 years to examine the longitudinal association between structural and functional measurements. RESULTS:In simulation studies, LIMBARE showed the smallest bias and mean squared error for estimating the breakpoint, with an empirical coverage probability of corresponding confidence interval estimates closest to the nominal level for scenarios with and without outlier data points. In the application to the longitudinal ophthalmic study, LIMBARE detected two breakpoints between visual field mean deviation (MD) and retinal nerve fiber layer thickness and one breakpoint between MD and cup-to-disc ratio, whereas the cross-sectional analysis approach detected only one and none, respectively. CONCLUSIONS:LIMBARE enhances breakpoint estimation accuracy in longitudinal ophthalmic studies, and the cross-sectional analysis approach is not recommended for future studies. TRANSLATIONAL RELEVANCE:Our proposed method and companion R package provide a valuable computational tool for advancing longitudinal ophthalmology research and exploring the association relationships among ophthalmic variables.
PMCID:10807490
PMID: 38241038
ISSN: 2164-2591
CID: 5624452

Efficient estimation of disease odds ratios for follow-up genetic association studies

Hu, Jiyuan; Zhang, Wei; Li, Xinmin; Pan, Dongdong; Li, Qizhai
In the past decade, genome-wide association studies have identified thousands of susceptible variants associated with complex human diseases and traits. Conducting follow-up genetic association studies has become a standard approach to validate the findings of genome-wide association studies. One problem of high interest in genetic association studies is to accurately estimate the strength of the association, which is often quantified by odds ratios in case-control studies. However, estimating the association directly by follow-up studies is inefficient since this approach ignores information from the genome-wide association studies. In this article, an estimator called GFcom, which integrates information from genome-wide association studies and follow-up studies, is proposed. The estimator includes both the point estimate and corresponding confidence interval. GFcom is more efficient than competing estimators regarding MSE and the length of confidence intervals. The superiority of GFcom is particularly evident when the genome-wide association study suffers from severe selection bias. Comprehensive simulation studies and applications to three real follow-up studies demonstrate the performance of the proposed estimator. An R package, "GFcom", implementing our method is publicly available at https://github.com/JiyuanHu/GFcom .
PMID: 29157118
ISSN: 1477-0334
CID: 4534842

Human Aldose Reductase Expression Prevents Atherosclerosis Regression in Diabetic Mice

Yuan, Chujun; Hu, Jiyuan; Parathath, Saj; Grauer, Lisa; Cassella, Courtney Blachford; Bagdasarov, Svetlana; Goldberg, Ira J; Ramasamy, Ravichandran; Fisher, Edward A
Guidelines to reduce cardiovascular risk in diabetes include aggressive LDL lowering, but benefits are attenuated compared to those in patients without diabetes. Consistent with this, we have reported in mice that hyperglycemia impaired atherosclerosis regression. Aldose reductase (AR) is thought to contribute to clinical complications of diabetes by directing glucose into pathways producing inflammatory metabolites. Mice have low levels of AR, thus, raising them to human levels would be a more clinically relevant model to study changes in diabetes under atherosclerosis regression conditions. Donor aortae from western diet-fed Ldlr
PMCID:6110315
PMID: 29891593
ISSN: 1939-327x
CID: 3155152

A two-stage microbial association mapping framework with advanced FDR control

Hu, Jiyuan; Koh, Hyunwook; He, Linchen; Liu, Menghan; Blaser, Martin J; Li, Huilin
BACKGROUND:In microbiome studies, it is important to detect taxa which are associated with pathological outcomes at the lowest definable taxonomic rank, such as genus or species. Traditionally, taxa at the target rank are tested for individual association, followed by the Benjamini-Hochberg (BH) procedure to control for false discovery rate (FDR). However, this approach neglects the dependence structure among taxa and may lead to conservative results. The taxonomic tree of microbiome data represents alignment from phylum to species rank and characterizes evolutionary relationships across microbial taxa. Taxa that are closer on the tree usually have similar responses to the exposure (environment). The statistical power in microbial association tests can be enhanced by efficiently employing the prior evolutionary information via the taxonomic tree. METHODS:We propose a two-stage microbial association mapping framework (massMap) which uses grouping information from the taxonomic tree to strengthen statistical power in association tests at the target rank. massMap first screens the association of taxonomic groups at a pre-selected higher taxonomic rank using a powerful microbial group test OMiAT. The method then proceeds to test the association for each candidate taxon at the target rank within the significant taxonomic groups identified in the first stage. Hierarchical BH (HBH) and selected subset testing (SST) procedures are evaluated to control the FDR for the two-stage structured tests. RESULTS:Our simulations show that massMap incorporating OMiAT and the advanced FDR controlling methodologies largely alleviates the multiplicity issue. It is statistically more powerful than the traditional association mapping directly at the target rank while controlling the FDR at desired levels under most scenarios. In our real data analyses, massMap detects more or the same amount of associated species with smaller adjusted p values compared to the traditional method, which further illustrates the efficiency of the proposed framework. The R package of massMap is publicly available at https://sites.google.com/site/huilinli09/software and https://github.com/JiyuanHu/ . CONCLUSIONS:massMap is a novel microbial association mapping framework and achieves additional efficiency by utilizing the intrinsic taxonomic structure of microbiome data.
PMCID:6060480
PMID: 30045760
ISSN: 2049-2618
CID: 3206642

contamDE: differential expression analysis of RNA-seq data for contaminated tumor samples

Shen, Qi; Hu, Jiyuan; Jiang, Ning; Hu, Xiaohua; Luo, Zewei; Zhang, Hong
MOTIVATION:Accurate detection of differentially expressed genes between tumor and normal samples is a primary approach of cancer-related biomarker identification. Due to the infiltration of tumor-surrounding normal cells, the expression data derived from tumor samples would always be contaminated with normal cells. Ignoring such cellular contamination would deflate the power of detecting DE genes and further confound the biological interpretation of the analysis results. For the time being, there does not exist any differential expression analysis approach for RNA-seq data in the literature that can properly account for the contamination of tumor samples. RESULTS:Without appealing to any extra information, we develop a new method 'contamDE' based on a novel statistical model that associates RNA-seq expression levels with cell types. It is demonstrated through simulation studies that contamDE could be much more powerful than the existing methods that ignore the contamination. In the application to two cancer studies, contamDE uniquely found several potential therapy and prognostic biomarkers of prostate cancer and non-small cell lung cancer. AVAILABILITY AND IMPLEMENTATION:An R package contamDE is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/ CONTACT:zhanghfd@fudan.edu.cn SUPPLEMENTARY INFORMATION:Supplementary data are available at Bioinformatics online.
PMID: 26556386
ISSN: 1367-4811
CID: 4534822

Flow cytometric assessment of leukemia-associated monocytes in childhood B-cell acute lymphoblastic leukemia outcome

Contreras Yametti, Gloria Paz; Evensen, Nikki A; Schloss, Jennifer; Aldebert, Clemence; Duan, Emily; Zhang, Yan; Hu, Jiyuan; Chambers, Tiffany M; Scheurer, Michael E; Teachey, David T; Rabin, Karen R; Raetz, Elizabeth A; Aifantis, Iannis; Carroll, William L; Witkowski, Matthew T
PMID: 37196626
ISSN: 2473-9537
CID: 5505192

Portable Air Cleaners and Home Systolic Blood Pressure in Adults With Hypertension Living in New York City Public Housing [Letter]

Wittkopp, Sharine; Anastasiou, Elle; Hu, Jiyuan; Liu, Mengling; Langford, Aisha T; Brook, Robert D; Gordon, Terry; Thorpe, Lorna E; Newman, Jonathan D
PMCID:10356071
PMID: 37382099
ISSN: 2047-9980
CID: 5537272