A Community Challenge for Inferring Genetic Predictors of Gene Essentialities through Analysis of a Functional Screen of Cancer Cell Lines
Gönen, Mehmet; Weir, Barbara A; Cowley, Glenn S; Vazquez, Francisca; Guan, Yuanfang; Jaiswal, Alok; Karasuyama, Masayuki; Uzunangelov, Vladislav; Wang, Tao; Tsherniak, Aviad; Howell, Sara; Marbach, Daniel; Hoff, Bruce; Norman, Thea C; Airola, Antti; Bivol, Adrian; Bunte, Kerstin; Carlin, Daniel; Chopra, Sahil; Deran, Alden; Ellrott, Kyle; Gopalacharyulu, Peddinti; Graim, Kiley; Kaski, Samuel; Khan, Suleiman A; Newton, Yulia; Ng, Sam; Pahikkala, Tapio; Paull, Evan; Sokolov, Artem; Tang, Hao; Tang, Jing; Wennerberg, Krister; Xie, Yang; Zhan, Xiaowei; Zhu, Fan; Aittokallio, Tero; Mamitsuka, Hiroshi; Stuart, Joshua M; Boehm, Jesse S; Root, David E; Xiao, Guanghua; Stolovitzky, Gustavo; Hahn, William C; Margolin, Adam A
We report the results of a DREAM challenge designed to predict relative genetic essentialities based on a novel dataset testing 98,000 shRNAs against 149 molecularly characterized cancer cell lines. We analyzed the results of over 3,000 submissions over a period of 4 months. We found that algorithms combining essentiality data across multiple genes demonstrated increased accuracy; gene expression was the most informative molecular data type; the identity of the gene being predicted was far more important than the modeling strategy; well-predicted genes and selected molecular features showed enrichment in functional categories; and frequently selected expression features correlated with survival in primary tumors. This study establishes benchmarks for gene essentiality prediction, presents a community resource for future comparison with this benchmark, and provides insights into factors influencing the ability to predict gene essentiality from functional genetic screens. This study also demonstrates the value of releasing pre-publication data publicly to engage the community in an open research collaboration.
PMCID:5814247
PMID: 28988802
ISSN: 2405-4712
CID: 5822572
A DREAM Challenge to Build Prediction Models for Short-Term Discontinuation of Docetaxel in Metastatic Castration-Resistant Prostate Cancer
Seyednasrollah, Fatemeh; Koestler, Devin C; Wang, Tao; Piccolo, Stephen R; Vega, Roberto; Greiner, Russell; Fuchs, Christiane; Gofer, Eyal; Kumar, Luke; Wolfinger, Russell D; Kanigel Winner, Kimberly; Bare, Chris; Neto, Elias Chaibub; Yu, Thomas; Shen, Liji; Abdallah, Kald; Norman, Thea; Stolovitzky, Gustavo; Soule, Howard R; Sweeney, Christopher J; Ryan, Charles J; Scher, Howard I; Sartor, Oliver; Elo, Laura L; Zhou, Fang Liz; Guinney, Justin; Costello, James C
PURPOSE: Docetaxel has a demonstrated survival benefit for patients with metastatic castration-resistant prostate cancer (mCRPC); however, 10% to 20% of patients discontinue docetaxel prematurely because of toxicity-induced adverse events, and the management of risk factors for toxicity remains a challenge. PATIENTS AND METHODS: The comparator arms of four phase III clinical trials in first-line mCRPC were collected, annotated, and compiled, with a total of 2,070 patients. Early discontinuation was defined as treatment stoppage within 3 months as a result of adverse treatment effects; 10% of patients discontinued treatment. We designed an open-data, crowd-sourced DREAM Challenge for developing models with which to predict early discontinuation of docetaxel treatment. Clinical features for all four trials and outcomes for three of the four trials were made publicly available, with the outcomes of the fourth trial held back for unbiased model evaluation. Challenge participants from around the world trained models and submitted their predictions. Area under the precision-recall curve was the primary metric used for performance assessment. RESULTS: In total, 34 separate teams submitted predictions. Seven models with statistically similar area under precision-recall curves (Bayes factor ≤ 3) outperformed all other models. A postchallenge analysis of risk prediction using these seven models revealed three patient subgroups: high risk, low risk, or discordant risk. Early discontinuation events were two times higher in the high-risk subgroup compared with the low-risk subgroup. Simulation studies demonstrated that use of patient discontinuation prediction models could reduce patient enrollment in clinical trials without the loss of statistical power. CONCLUSION: This work represents a successful collaboration between 34 international teams that leveraged open clinical trial data. Our results demonstrate that routinely collected clinical features can be used to identify patients with mCRPC who are likely to discontinue treatment because of adverse events and establish a robust benchmark with implications for clinical trial design.
PMCID:6874023
PMID: 30657384
ISSN: 2473-4276
CID: 5822622
The inconvenience of data of convenience: computational research beyond post-mortem analyses [Letter]
Azencott, Chloé-Agathe; Aittokallio, Tero; Roy, Sushmita; Norman, Thea; Friend, Stephen; Stolovitzky, Gustavo; Goldenberg, Anna
PMID: 28960198
ISSN: 1548-7105
CID: 5822562
Broken flow symmetry explains the dynamics of small particles in deterministic lateral displacement arrays
Kim, Sung-Cheol; Wunsch, Benjamin H; Hu, Huan; Smith, Joshua T; Austin, Robert H; Stolovitzky, Gustavo
Deterministic lateral displacement (DLD) is a technique for size fractionation of particles in continuous flow that has shown great potential for biological applications. Several theoretical models have been proposed, but experimental evidence has demonstrated that a rich class of intermediate migration behavior exists, which is not predicted. We present a unified theoretical framework to infer the path of particles in the whole array on the basis of trajectories in a unit cell. This framework explains many of the unexpected particle trajectories reported and can be used to design arrays for even nanoscale particle fractionation. We performed experiments that verify these predictions and used our model to develop a condenser array that achieves full particle separation with a single fluidic input.
PMCID:5495280
PMID: 28607075
ISSN: 1091-6490
CID: 5822552
Predicting human olfactory perception from chemical features of odor molecules
Keller, Andreas; Gerkin, Richard C; Guan, Yuanfang; Dhurandhar, Amit; Turu, Gabor; Szalai, Bence; Mainland, Joel D; Ihara, Yusuke; Yu, Chung Wen; Wolfinger, Russ; Vens, Celine; Schietgat, Leander; De Grave, Kurt; Norel, Raquel; Stolovitzky, Gustavo; Cecchi, Guillermo A; Vosshall, Leslie B; Meyer, Pablo
It is still not possible to predict whether a given molecule will have a perceived odor or what olfactory percept it will produce. We therefore organized the crowd-sourced DREAM Olfaction Prediction Challenge. Using a large olfactory psychophysical data set, teams developed machine-learning algorithms to predict sensory attributes of molecules based on their chemoinformatic features. The resulting models accurately predicted odor intensity and pleasantness and also successfully predicted 8 among 19 rated semantic descriptors ("garlic," "fish," "sweet," "fruit," "burnt," "spices," "flower," and "sour"). Regularized linear models performed nearly as well as random forest-based ones, with a predictive accuracy that closely approaches a key theoretical limit. These models help to predict the perceptual qualities of virtually any molecule with high accuracy and also reverse-engineer the smell of a molecule.
PMCID:5455768
PMID: 28219971
ISSN: 1095-9203
CID: 5822542
Wafer-scale integration of sacrificial nanofluidic chips for detecting and manipulating single DNA molecules
Wang, Chao; Nam, Sung-Wook; Cotte, John M; Jahnes, Christopher V; Colgan, Evan G; Bruce, Robert L; Brink, Markus; Lofaro, Michael F; Patel, Jyotica V; Gignac, Lynne M; Joseph, Eric A; Rao, Satyavolu Papa; Stolovitzky, Gustavo; Polonsky, Stanislav; Lin, Qinghuang
Wafer-scale fabrication of complex nanofluidic systems with integrated electronics is essential to realizing ubiquitous, compact, reliable, high-sensitivity and low-cost biomolecular sensors. Here we report a scalable fabrication strategy capable of producing nanofluidic chips with complex designs and down to single-digit nanometre dimensions over 200 mm wafer scale. Compatible with semiconductor industry standard complementary metal-oxide semiconductor logic circuit fabrication processes, this strategy extracts a patterned sacrificial silicon layer through hundreds of millions of nanoscale vent holes on each chip by gas-phase Xenon difluoride etching. Using single-molecule fluorescence imaging, we demonstrate these sacrificial nanofluidic chips can function to controllably and completely stretch lambda DNA in a two-dimensional nanofluidic network comprising channels and pillars. The flexible nanofluidic structure design, wafer-scale fabrication, single-digit nanometre channels, reliable fluidic sealing and low thermal budget make our strategy a potentially universal approach to integrating functional planar nanofluidic systems with logic circuits for lab-on-a-chip applications.
PMCID:5264239
PMID: 28112157
ISSN: 2041-1723
CID: 5822532
Prediction of overall survival for patients with metastatic castration-resistant prostate cancer: development of a prognostic model through a crowdsourced challenge with open clinical trial data
Guinney, Justin; Wang, Tao; Laajala, Teemu D; Winner, Kimberly Kanigel; Bare, J Christopher; Neto, Elias Chaibub; Khan, Suleiman A; Peddinti, Gopal; Airola, Antti; Pahikkala, Tapio; Mirtti, Tuomas; Yu, Thomas; Bot, Brian M; Shen, Liji; Abdallah, Kald; Norman, Thea; Friend, Stephen; Stolovitzky, Gustavo; Soule, Howard; Sweeney, Christopher J; Ryan, Charles J; Scher, Howard I; Sartor, Oliver; Xie, Yang; Aittokallio, Tero; Zhou, Fang Liz; Costello, James C
BACKGROUND: Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced, DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to not only identify a better prognostic model for prediction of survival in patients with metastatic castration-resistant prostate cancer but also engage a community of international data scientists to study this disease. METHODS: Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere, comprising 476 patients treated with docetaxel and prednisone from the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly to be used as training data to predict the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but data for outcome variables (overall survival and event status) were hidden from the challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone. FINDINGS: 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0·791; Bayes factor >5) and surpassed the reference model (iAUC 0·743; Bayes factor >20). Both the ePCR model and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3·32, 95% CI 2·39-4·62, p<0·0001; reference model: 2·56, 1·85-3·53, p<0·0001). The new model was validated further on the ENTHUSE M1 cohort with similarly high performance (iAUC 0·768). Meta-analysis across all methods confirmed previously identified predictive clinical variables and revealed aspartate aminotransferase as an important, albeit previously under-reported, prognostic biomarker. INTERPRETATION: Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for development of methods in the future. The results of this effort show that data-sharing, when combined with a crowdsourced challenge, is a robust and powerful framework to develop new prognostic models in advanced prostate cancer. FUNDING: Sanofi US Services, Project Data Sphere.
PMID: 27864015
ISSN: 1474-5488
CID: 5822522
Nanoscale lateral displacement arrays for the separation of exosomes and colloids down to 20 nm
Wunsch, Benjamin H; Smith, Joshua T; Gifford, Stacey M; Wang, Chao; Brink, Markus; Bruce, Robert L; Austin, Robert H; Stolovitzky, Gustavo; Astier, Yann
Deterministic lateral displacement (DLD) pillar arrays are an efficient technology to sort, separate and enrich micrometre-scale particles, which include parasites, bacteria, blood cells and circulating tumour cells in blood. However, this technology has not been translated to the true nanoscale, where it could function on biocolloids, such as exosomes. Exosomes, a key target of 'liquid biopsies', are secreted by cells and contain nucleic acid and protein information about their originating tissue. One challenge in the study of exosome biology is to sort exosomes by size and surface markers. We use manufacturable silicon processes to produce nanoscale DLD (nano-DLD) arrays of uniform gap sizes ranging from 25 to 235 nm. We show that at low Péclet (Pe) numbers, at which diffusion and deterministic displacement compete, nano-DLD arrays separate particles between 20 and 110 nm based on size with sharp resolution. Further, we demonstrate the size-based displacement of exosomes, and so open up the potential for on-chip sorting and quantification of these important biocolloids.
PMID: 27479757
ISSN: 1748-3395
CID: 5822512
Crowdsourced assessment of common genetic contribution to predicting anti-TNF treatment response in rheumatoid arthritis
Sieberts, Solveig K; Zhu, Fan; García-García, Javier; Stahl, Eli; Pratap, Abhishek; Pandey, Gaurav; Pappas, Dimitrios; Aguilar, Daniel; Anton, Bernat; Bonet, Jaume; Eksi, Ridvan; Fornés, Oriol; Guney, Emre; Li, Hongdong; Marín, Manuel Alejandro; Panwar, Bharat; Planas-Iglesias, Joan; Poglayen, Daniel; Cui, Jing; Falcao, Andre O; Suver, Christine; Hoff, Bruce; Balagurusamy, Venkat S K; Dillenberger, Donna; Neto, Elias Chaibub; Norman, Thea; Aittokallio, Tero; Ammad-Ud-Din, Muhammad; Azencott, Chloe-Agathe; Bellón, Víctor; Boeva, Valentina; Bunte, Kerstin; Chheda, Himanshu; Cheng, Lu; Corander, Jukka; Dumontier, Michel; Goldenberg, Anna; Gopalacharyulu, Peddinti; Hajiloo, Mohsen; Hidru, Daniel; Jaiswal, Alok; Kaski, Samuel; Khalfaoui, Beyrem; Khan, Suleiman Ali; Kramer, Eric R; Marttinen, Pekka; Mezlini, Aziz M; Molparia, Bhuvan; Pirinen, Matti; Saarela, Janna; Samwald, Matthias; Stoven, Véronique; Tang, Hao; Tang, Jing; Torkamani, Ali; Vert, Jean-Phillipe; Wang, Bo; Wang, Tao; Wennerberg, Krister; Wineinger, Nathan E; Xiao, Guanghua; Xie, Yang; Yeung, Rae; Zhan, Xiaowei; Zhao, Cheng; Greenberg, Jeff; Kremer, Joel; Michaud, Kaleb; Barton, Anne; Coenen, Marieke; Mariette, Xavier; Miceli, Corinne; Shadick, Nancy; Weinblatt, Michael; de Vries, Niek; Tak, Paul P; Gerlag, Danielle; Huizinga, Tom W J; Kurreeman, Fina; Allaart, Cornelia F; Louis Bridges, S; Criswell, Lindsey; Moreland, Larry; Klareskog, Lars; Saevarsdottir, Saedis; Padyukov, Leonid; Gregersen, Peter K; Friend, Stephen; Plenge, Robert; Stolovitzky, Gustavo; Oliva, Baldo; Guan, Yuanfang; Mangravite, Lara M
Rheumatoid arthritis (RA) affects millions world-wide. While anti-TNF treatment is widely used to reduce disease progression, treatment fails in ∼one-third of patients. No biomarker currently exists that identifies non-responders before treatment. A rigorous community-based assessment of the utility of SNP data for predicting anti-TNF treatment efficacy in RA patients was performed in the context of a DREAM Challenge (http://www.synapse.org/RA_Challenge). An open challenge framework enabled the comparative evaluation of predictions developed by 73 research groups using the most comprehensive available data and covering a wide range of state-of-the-art modelling methodologies. Despite a significant genetic heritability estimate of treatment non-response trait (h² = 0.18, P value = 0.02), no significant genetic contribution to prediction accuracy is observed. Results formally confirm the expectations of the rheumatology community that SNP information does not significantly improve predictive performance relative to standard clinical traits, thereby justifying a refocusing of future efforts on collection of other data.
PMID: 27549343
ISSN: 2041-1723
CID: 4178352
Crowdsourcing biomedical research: leveraging communities as innovation engines
Saez-Rodriguez, Julio; Costello, James C; Friend, Stephen H; Kellen, Michael R; Mangravite, Lara; Meyer, Pablo; Norman, Thea; Stolovitzky, Gustavo
The generation of large-scale biomedical data is creating unprecedented opportunities for basic and translational science. Typically, the data producers perform initial analyses, but it is very likely that the most informative methods may reside with other groups. Crowdsourcing the analysis of complex and massive data has emerged as a framework to find robust methodologies. When the crowdsourcing is done in the form of collaborative scientific competitions, known as Challenges, the validation of the methods is inherently addressed. Challenges also encourage open innovation, create collaborative communities to solve diverse and important biomedical problems, and foster the creation and dissemination of well-curated data repositories.
PMCID:5918684
PMID: 27418159
ISSN: 1471-0064
CID: 5822502