Reactome and ORCID: fine-grained credit attribution for community curation
Viteri, Guilherme; Matthews, Lisa; Varusai, Thawfeek; Gillespie, Marc; Milacic, Marija; Cook, Justin; Weiser, Joel; Shorser, Solomon; Sidiropoulos, Konstantinos; Fabregat, Antonio; Haw, Robin; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning
Reactome is a manually curated, open-source, open-data knowledge base of biomolecular pathways. Reactome has always provided clear credit attribution for authors, curators and reviewers through fine-grained annotation of all three roles at the reaction and pathway level. These data are visible in the web interface and provided through the various data download formats. To enhance visibility and credit attribution for the work of authors, curators and reviewers, and to provide additional opportunities for Reactome community engagement, we have implemented key changes to Reactome: contributor names are now fully searchable in the web interface, and contributors can 'claim' their contributions to their ORCID profile with a few clicks. In addition, we are reaching out to domain experts to request their help in reviewing and editing Reactome pathways through a new 'Contribution' section, highlighting pathways which are awaiting community review. Database URL: https://reactome.org.
PMCID:6892999
PMID: 31802127
ISSN: 1758-0463
CID: 4249952
Integrative annotation and knowledge discovery of kinase post-translational modifications and cancer-associated mutations through federated protein ontologies and resources
Huang, Liang-Chin; Ross, Karen E; Baffi, Timothy R; Drabkin, Harold; Kochut, Krzysztof J; Ruan, Zheng; D'Eustachio, Peter; McSkimming, Daniel; Arighi, Cecilia; Chen, Chuming; Natale, Darren A; Smith, Cynthia; Gaudet, Pascale; Newton, Alexandra C; Wu, Cathy; Kannan, Natarajan
Many bioinformatics resources with unique perspectives on the protein landscape are currently available. However, generating new knowledge from these resources requires interoperable workflows that support cross-resource queries. In this study, we employ federated queries linking information from the Protein Kinase Ontology, iPTMnet, Protein Ontology, neXtProt, and the Mouse Genome Informatics to identify key knowledge gaps in the functional coverage of the human kinome and prioritize understudied kinases, cancer variants and post-translational modifications (PTMs) for functional studies. We identify 32 functional domains enriched in cancer variants and PTMs and generate mechanistic hypotheses on overlapping variant and PTM sites by aggregating information at the residue, protein, pathway and species level from these resources. We experimentally test the hypothesis that S768 phosphorylation in the C-helix of EGFR is inhibitory by showing that oncogenic variants altering S768 phosphorylation increase basal EGFR activity. In contrast, oncogenic variants altering conserved phosphorylation sites in the 'hydrophobic motif' of PKCβII (S660F and S660C) are loss-of-function in that they reduce kinase activity and enhance membrane translocation. Our studies provide a framework for integrative, consistent, and reproducible annotation of the cancer kinomes.
PMCID:5916945
PMID: 29695735
ISSN: 2045-2322
CID: 3052382
Reactome diagram viewer: Data structures and strategies to boost performance
Fabregat, Antonio; Sidiropoulos, Konstantinos; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning
Motivation: Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. For web-based pathway visualisation, Reactome uses a custom pathway diagram viewer that has been evolved over the past years. Here, we present comprehensive enhancements in usability and performance based on extensive usability testing sessions and technology developments, aiming to optimise the viewer towards the needs of the community. Results: The pathway diagram viewer version 3 achieves consistently better performance, loading and rendering of 97% of the diagrams in Reactome in less than 1 second. Combining the multi-layer html5 canvas strategy with a space partitioning data structure minimises CPU workload, enabling the introduction of new features that further enhance user experience. Through the use of highly optimised data structures and algorithms, Reactome has boosted the performance and usability of the new pathway diagram viewer, providing a robust, scalable and easy-to-integrate solution to pathway visualisation. As graph-based visualisation of complex data is a frequent challenge in bioinformatics, many of the individual strategies presented here are applicable to a wide range of web-based bioinformatics resources. Availability and Implementation: Reactome is available online at: https://reactome.org. The diagram viewer is part of the Reactome pathway browser (https://reactome.org/PathwayBrowser/) and also available as a stand-alone widget at: https://reactome.org/dev/diagram/. The source code is freely available at: https://github.com/reactome-pwp/diagram. Contact: hhe@ebi.ac.uk, fabregat@ebi.ac.uk. Supplementary information: An introductory video explaining the most relevant features of the Reactome pathway browser and the diagram viewer is available at https://youtu.be/-skixrvI4nU.
PMCID:6030826
PMID: 29186351
ISSN: 1367-4811
CID: 2798072
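The abstract above credits part of the viewer's speed to a space-partitioning data structure but does not name it. As a rough illustration only, the sketch below uses a quadtree (an assumption, chosen as one common space-partitioning structure) to find which diagram boxes lie under a pointer position without scanning every box. The class name, parameters, and sample boxes are all hypothetical and are not taken from the Reactome source code.

```python
class QuadTree:
    """Quadtree over axis-aligned boxes given as (x, y, width, height)."""

    def __init__(self, x, y, w, h, capacity=4, depth=0, max_depth=8):
        self.bounds = (x, y, w, h)
        self.capacity = capacity      # items a leaf may hold before it splits
        self.depth = depth
        self.max_depth = max_depth    # stops endless splitting of overlapping boxes
        self.items = []               # (name, box) pairs, kept only in leaves
        self.children = None

    @staticmethod
    def _intersects(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    def insert(self, name, box):
        if not self._intersects(self.bounds, box):
            return
        if self.children is not None:
            for child in self.children:   # a box may span several quadrants
                child.insert(name, box)
            return
        self.items.append((name, box))
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        x, y, w, h = self.bounds
        hw, hh = w / 2, h / 2
        self.children = [
            QuadTree(cx, cy, hw, hh, self.capacity, self.depth + 1, self.max_depth)
            for cx, cy in ((x, y), (x + hw, y), (x, y + hh), (x + hw, y + hh))
        ]
        items, self.items = self.items, []
        for name, box in items:           # push existing items down to the new leaves
            for child in self.children:
                child.insert(name, box)

    def query_point(self, px, py):
        """Names of all boxes containing the point (px, py)."""
        x, y, w, h = self.bounds
        if not (x <= px < x + w and y <= py < y + h):
            return []
        if self.children is not None:
            # Half-open bounds mean exactly one child contains the point.
            return [hit for child in self.children for hit in child.query_point(px, py)]
        return [name for name, (bx, by, bw, bh) in self.items
                if bx <= px < bx + bw and by <= py < by + bh]


# Hypothetical diagram boxes on a 100 x 100 canvas.
tree = QuadTree(0, 0, 100, 100, capacity=2)
for name, box in [("A", (10, 10, 20, 10)), ("B", (60, 60, 30, 30)),
                  ("C", (65, 65, 10, 10)), ("D", (5, 80, 10, 10)),
                  ("E", (80, 5, 10, 10))]:
    tree.insert(name, box)

hover_hits = sorted(tree.query_point(70, 70))
```

A hover or click test then only inspects the few boxes stored in the leaf under the pointer, which is the kind of CPU saving the abstract describes.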
Interleukins and their signaling pathways in the Reactome biological pathway database
Jupe, Steven; Ray, Keith; Duenas Roca, Corina; Varusai, Thawfeek; Shamovsky, Veronica; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning
BACKGROUND: There is a wealth of biological pathway information available in the scientific literature but it is spread across many thousands of publications. Alongside publications that contain definitive experimental discoveries are many others that have been dismissed as spurious, or found to be irreproducible, or are contradicted by later results and consequently now considered controversial. Many descriptions and images of pathways are incomplete, stylized representations that assume the reader is an expert, familiar with the established details of the process, which are consequently not fully explained. Pathway representations in publications frequently do not represent a complete, detailed and unambiguous description of the molecules involved, their precise post-translational state, or a full account of the molecular events they undergo while participating in a process. While this might be sufficient to be interpreted by an expert reader, the lack of detail makes such pathways less useful and difficult to understand for anyone unfamiliar with the area and of limited use as the basis for computational models. OBJECTIVE: Reactome was established as a freely accessible knowledgebase of human biological pathways that is manually populated with interconnected molecular events that fully detail the molecular participants, linked to published experimental data and background material, using a formal, open data structure that facilitates computational reuse. This data is accessible on a website in the form of pathway diagrams that have descriptive summaries and annotations, and as downloadable datasets in several formats that can be reused with other computational tools. The entire database and all supporting software can be downloaded and reused under a Creative Commons licence. METHODS: Pathways are authored by expert biologists who work with Reactome curators and editorial staff to represent the consensus in the field. Pathways are represented as interactive diagrams that include as much molecular detail as possible, linked to literature citations that contain supporting experimental details. All newly created events undergo a peer-review process before they are added to the database and made available on the associated website. New content is added quarterly. RESULTS: The 63rd release of Reactome in December 2017 contains 10996 human proteins, participating in 11426 events in 2179 pathways. In addition, analysis tools allow dataset submission for the identification and visualization of pathway enrichment and representation of expression profiles as an overlay on Reactome pathways. Protein-protein and compound-protein interactions from several sources including custom user datasets can be added to extend pathways. Pathway diagrams and analysis result displays can be downloaded as editable images, human-readable reports and as files in several standard formats that are suitable for computational re-use. Reactome content is available programmatically via a REST-based content service and as a Neo4J graph database. Signaling pathways for Interleukins 1 to 38 are hierarchically classified within the pathway 'Signaling by Interleukins'. The classification used is largely derived from Akdis et al. (2016). CONCLUSIONS: The addition to Reactome of a complete set of the known human interleukins, their receptors and established signaling pathways, linked to annotations of relevant aspects of immune function, provides a significant computationally-accessible resource of information about this important family. This information can easily be extended as new discoveries become accepted as the consensus in the field. A key aim for the future is to increase coverage of gene expression changes induced by interleukin signaling.
PMCID:5927619
PMID: 29378288
ISSN: 1097-6825
CID: 2933722
The Reactome Pathway Knowledgebase
Fabregat, Antonio; Jupe, Steven; Matthews, Lisa; Sidiropoulos, Konstantinos; Gillespie, Marc; Garapati, Phani; Haw, Robin; Jassal, Bijay; Korninger, Florian; May, Bruce; Milacic, Marija; Roca, Corina Duenas; Rothfels, Karen; Sevilla, Cristoffer; Shamovsky, Veronica; Shorser, Solomon; Varusai, Thawfeek; Viteri, Guilherme; Weiser, Joel; Wu, Guanming; Stein, Lincoln; Hermjakob, Henning; D'Eustachio, Peter
The Reactome Knowledgebase (https://reactome.org) provides molecular details of signal transduction, transport, DNA replication, metabolism, and other cellular processes as an ordered network of molecular transformations (an extended version of a classic metabolic map) in a single consistent data model. Reactome functions both as an archive of biological processes and as a tool for discovering unexpected functional relationships in data such as gene expression profiles or somatic mutation catalogues from tumor cells. To support the continued brisk growth in the size and complexity of Reactome, we have implemented a graph database, improved performance of data analysis tools, and designed new data structures and strategies to boost diagram viewer performance. To make our website more accessible to human users, we have improved pathway display and navigation by implementing interactive Enhanced High Level Diagrams (EHLDs) with an associated icon library, and subpathway highlighting and zooming, in a simplified and reorganized web site with adaptive design. To encourage re-use of our content, we have enabled export of pathway diagrams as 'PowerPoint' files.
PMCID:5753187
PMID: 29145629
ISSN: 1362-4962
CID: 2785172
Gramene 2018: unifying comparative genomics and pathway resources for plant research
Tello-Ruiz, Marcela K; Naithani, Sushma; Stein, Joshua C; Gupta, Parul; Campbell, Michael; Olson, Andrew; Wei, Sharon; Preece, Justin; Geniza, Matthew J; Jiao, Yinping; Lee, Young Koung; Wang, Bo; Mulvaney, Joseph; Chougule, Kapeel; Elser, Justin; Al-Bader, Noor; Kumari, Sunita; Thomason, James; Kumar, Vivek; Bolser, Daniel M; Naamati, Guy; Tapanari, Electra; Fonseca, Nuno; Huerta, Laura; Iqbal, Haider; Keays, Maria; Munoz-Pomer Fuentes, Alfonso; Tang, Amy; Fabregat, Antonio; D'Eustachio, Peter; Weiser, Joel; Stein, Lincoln D; Petryszak, Robert; Papatheodorou, Irene; Kersey, Paul J; Lockhart, Patti; Taylor, Crispin; Jaiswal, Pankaj; Ware, Doreen
Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.
PMCID:5753211
PMID: 29165610
ISSN: 1362-4962
CID: 2792292
Reactome graph database: Efficient access to complex pathway data
Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
PMCID:5805351
PMID: 29377902
ISSN: 1553-7358
CID: 2933692
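The abstract above argues that queries which traverse highly interconnected data are a natural fit for a graph store. The sketch below is an in-memory illustration of that idea only: it does not use Neo4j, Cypher, or the Reactome ContentService. The `HAS_EVENT` mapping and every event name in it are hypothetical toy data. Collecting all reactions beneath a pathway is a single walk along parent-to-child edges, whereas a relational layout would typically need one join per level of the hierarchy.

```python
from collections import deque

# Hypothetical toy hierarchy: parent event -> child events.
HAS_EVENT = {
    "Pathway:Signaling": ["Pathway:Receptor activation",
                          "Reaction:Ligand binds receptor"],
    "Pathway:Receptor activation": ["Reaction:Receptor dimerises",
                                    "Reaction:Receptor phosphorylated"],
}


def descendant_reactions(graph, start):
    """Breadth-first walk over parent->child edges, collecting reactions."""
    seen = {start}
    queue = deque([start])
    reactions = []
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child in seen:          # guards against shared sub-pathways
                continue
            seen.add(child)
            if child.startswith("Reaction:"):
                reactions.append(child)
            queue.append(child)
    return reactions


reactions = descendant_reactions(HAS_EVENT, "Pathway:Signaling")
```

The walk handles any nesting depth without knowing it in advance, which is the property the abstract attributes to graph-based storage.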
Reactome enhanced pathway visualization
Sidiropoulos, Konstantinos; Viteri, Guilherme; Sevilla, Cristoffer; Jupe, Steve; Webber, Marissa; Orlic-Milacic, Marija; Jassal, Bijay; May, Bruce; Shamovsky, Veronica; Duenas, Corina; Rothfels, Karen; Matthews, Lisa; Song, Heeyeon; Stein, Lincoln; Haw, Robin; D'Eustachio, Peter; Ping, Peipei; Hermjakob, Henning; Fabregat, Antonio
Motivation: Reactome is a free, open-source, open-data, curated and peer-reviewed knowledge base of biomolecular pathways. Pathways are arranged in a hierarchical structure that largely corresponds to the GO biological process hierarchy, allowing the user to navigate from high level concepts like immune system to detailed pathway diagrams showing biomolecular events like membrane transport or phosphorylation. Here, we present new developments in the Reactome visualization system that facilitate navigation through the pathway hierarchy and enable efficient reuse of Reactome visualizations for users' own research presentations and publications. Results: For the higher levels of the hierarchy, Reactome now provides scalable, interactive textbook-style diagrams in SVG format, which are also freely downloadable and editable. Repeated diagram elements like 'mitochondrion' or 'receptor' are available as a library of graphic elements. Detailed lower-level diagrams are now downloadable in editable PPTX format as sets of interconnected objects. Availability and implementation: http://reactome.org. Contact: fabregat@ebi.ac.uk or hhe@ebi.ac.uk.
PMCID:5860170
PMID: 29077811
ISSN: 1367-4811
CID: 2757212
Reactome pathway analysis: a high-performance in-memory approach
Fabregat, Antonio; Sidiropoulos, Konstantinos; Viteri, Guilherme; Forner, Oscar; Marin-Garcia, Pablo; Arnau, Vicente; D'Eustachio, Peter; Stein, Lincoln; Hermjakob, Henning
BACKGROUND: Reactome aims to provide bioinformatics tools for visualisation, interpretation and analysis of pathway knowledge to support basic research, genome analysis, modelling, systems biology and education. Pathway analysis methods have a broad range of applications in physiological and biomedical research; one of the main problems, from the analysis methods performance point of view, is the constantly increasing size of the data samples. RESULTS: Here, we present a new high-performance in-memory implementation of the well-established over-representation analysis method. To achieve the target, the over-representation analysis method is divided into four different steps and, for each of them, specific data structures are used to improve performance and minimise the memory footprint. The first step, finding out whether an identifier in the user's sample corresponds to an entity in Reactome, is addressed using a radix tree as a lookup table. The second step, modelling the proteins, chemicals, their orthologues in other species and their composition in complexes and sets, is addressed with a graph. The third and fourth steps, which aggregate the results and calculate the statistics, are solved with a double-linked tree. CONCLUSION: Through the use of highly optimised, in-memory data structures and algorithms, Reactome has achieved a stable, high performance pathway analysis service, enabling the analysis of genome-wide datasets within seconds, allowing interactive exploration and analysis of high throughput data. The proposed pathway analysis approach is available in the Reactome production web site either via the AnalysisService for programmatic access or the user submission interface integrated into the PathwayBrowser. Reactome is an open data and open source project and all of its source code, including that described here, is available in the AnalysisTools repository in the Reactome GitHub (https://github.com/reactome/).
PMCID:5333408
PMID: 28249561
ISSN: 1471-2105
CID: 2471172
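The abstract above describes over-representation analysis in which a prefix-tree lookup first maps user identifiers to known entities, and statistics are computed afterwards. The sketch below illustrates that flow under stated assumptions. It uses a plain, uncompressed trie in place of a radix tree (a radix tree would additionally merge single-child chains), and a hypergeometric tail probability as one common over-representation statistic; the abstract does not say which test the service uses. The graph and double-linked tree steps are not reproduced. All identifiers, pathway names, and the population size are hypothetical.

```python
from math import comb

_END = object()  # sentinel key marking "an identifier ends at this node"


class Trie:
    """Uncompressed prefix tree mapping identifier strings to entity names."""

    def __init__(self):
        self.root = {}

    def insert(self, key, entity):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node.setdefault(_END, set()).add(entity)

    def lookup(self, key):
        node = self.root
        for ch in key:
            node = node.get(ch)
            if node is None:
                return set()
        return node.get(_END, set())


def hypergeom_tail(population, successes, draws, hits):
    """P(X >= hits) when drawing `draws` items without replacement."""
    total = comb(population, draws)
    upper = min(successes, draws)
    return sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(hits, upper + 1)) / total


def over_representation(sample_ids, index, pathways, population):
    """Return {pathway: (hits, p_value)} for the identifiers found in the index."""
    found = set()
    for identifier in sample_ids:
        found |= index.lookup(identifier)   # identifier -> entities
    results = {}
    for name, members in pathways.items():
        hits = len(found & members)
        p_value = hypergeom_tail(population, len(members), len(found), hits)
        results[name] = (hits, p_value)
    return results


# Hypothetical toy data: two identifiers resolve to the same entity.
index = Trie()
index.insert("P00533", "EGFR")
index.insert("EGFR", "EGFR")
index.insert("P04637", "TP53")

pathways = {"Signaling": {"EGFR", "GRB2", "SOS1"}, "Repair": {"TP53", "ATM"}}
results = over_representation(["EGFR", "P00533", "UNKNOWN"], index, pathways,
                              population=10)
```

Lookup cost in the trie depends on identifier length rather than on the number of stored identifiers, which is why a prefix tree suits the first step for large samples.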
Plant Reactome: a resource for plant pathways and comparative analysis
Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D; Wu, Guanming; Fabregat, Antonio; Elser, Justin L; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D; Ware, Doreen; Jaiswal, Pankaj
Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX.
PMCID:5210633
PMID: 27799469
ISSN: 1362-4962
CID: 2297172