The Candida Biochemical Pathways were created using the Pathway Tools
software, which is developed and
maintained by Peter Karp and his colleagues
at the Bioinformatics Research
Group at SRI International. While Pathway Tools supports curation
of additional types of data, including general gene product
information, small molecule transport reactions, enzyme kinetics, and
some (prokaryotic-focused) gene regulatory information, CGD is using
the software to create and curate biochemical pathways only. Therefore,
some features of the Pathway Tools software will not be fully
functional in CGD, and not all types of data will be available.
Initially, the CGD biochemical pathways were automatically
generated using PathoLogic, a pathway prediction program built into
the Pathway Tools. PathoLogic
used information about the function of individual Candida gene
products from CGD in conjunction with SRI's reference database of biochemical reaction and
pathway information, MetaCyc, to create a set of
predicted Candida albicans pathways. The starting set of Candida albicans
enzymes that was input into the PathoLogic software was generated
using the Gene Ontology curation from CGD. PathoLogic then
compared the list of enzyme names against SRI's MetaCyc pathway
database. To generate pathways for CGD, the
software was also configured to consult the set of pathways curated at
the Saccharomyces Genome Database (SGD) in addition to the pathways
contained in MetaCyc, such that curated S. cerevisiae pathways that are not
included in MetaCyc were used as an additional basis for comparison. If Candida albicans contains one or more enzymes that
participate in a pathway that exists in MetaCyc or SGD, the software predicted
that a similar pathway exists in Candida albicans. Many of the
pathways that were predicted have candidate Candida albicans enzyme assignments
for a subset, but not all, of the reactions that comprise the
pathway. The Pathway Hole Filler program that is built into Pathway
Tools was used to identify genes encoding candidate enzymes for the other
reactions (the "pathway holes"). To do so, the Pathway Hole Filler was configured to use
sequence-based information only (not other possible data types, see Green et
al. 2004); it made comparisons between GenBank sequences
associated with each of the "pathway hole" enzymes and the ORF
sequences from CGD, to assign candidate Candida albicans genes to
these activities where possible.
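The prediction rule described above can be illustrated with a minimal sketch. This is a toy model and not PathoLogic's actual algorithm: the `ReferencePathway` class, the `predict_pathways` function, and all pathway, reaction, and gene identifiers are hypothetical. It shows only the stated rule that a reference pathway is predicted when at least one of its reactions has a candidate enzyme, and that the remaining reactions are reported as pathway holes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencePathway:
    """A reference pathway, reduced to a name and its reaction identifiers."""
    name: str
    reactions: tuple


def predict_pathways(reference_pathways, enzymes_by_reaction):
    """Predict each reference pathway that has at least one assigned enzyme.

    Returns a dict mapping each predicted pathway name to the list of its
    reactions that have no enzyme assignment (the "pathway holes").
    """
    predictions = {}
    for pathway in reference_pathways:
        assigned = [r for r in pathway.reactions if enzymes_by_reaction.get(r)]
        if assigned:  # a single enzyme is enough, so this rule overpredicts
            holes = [r for r in pathway.reactions
                     if not enzymes_by_reaction.get(r)]
            predictions[pathway.name] = holes
    return predictions


# Hypothetical toy data; none of these identifiers come from CGD or MetaCyc.
reference = [
    ReferencePathway("pathway A", ("rxn1", "rxn2", "rxn3")),
    ReferencePathway("pathway B", ("rxn4", "rxn5")),
]
enzymes = {"rxn1": ["gene1"], "rxn3": ["gene2", "gene3"]}

predicted = predict_pathways(reference, enzymes)
print(predicted)
```

In this toy data, "pathway A" is predicted with one hole (`rxn2`), while "pathway B" is not predicted because none of its reactions has an enzyme. A hole-filling step would then try to assign a gene to `rxn2`.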
The automatically generated pathway predictions are being manually reviewed and curated at CGD. Because the software intentionally overpredicts pathways (i.e., errs on the side of inclusion), the pathway set that was initially predicted contained a number of spurious and redundant pathways. Curators conducted an initial review of the pathway list to identify relevant literature for most of the pathways and to remove most of the spurious predictions. When necessary, new Candida pathways have been added. Subsequently, each pathway undergoes more intensive review by CGD curators: any necessary updates are made to the pathway structure or reactions; links to any corresponding pathway(s) in SGD are added; and literature relevant to the pathway in Candida albicans and other Candida species is reviewed, summarized on the pathway page, and added to the pathway reference list. In many cases, a pathway has not been characterized directly in Candida; when this is so, it is indicated in the text of the summary on the pathway page.
Pathway curation at CGD is an ongoing process, and the pathway predictions contained in the database may be incomplete or incorrect. If you see any errors, please contact us.
The main query page is organized into several sections that allow you to search the Candida Biochemical Pathways dataset in different ways. The only dataset that is available is for the Candida albicans strain SC5314, which is the reference strain at CGD.
- The query box performs a keyword search of the information available in the biochemical pathways dataset at CGD. Use this option to search for a protein name, a pathway, a reaction, or a compound. (RNA searches are currently inoperable. Gene searches retrieve CGD Locus Summary Pages.) The keyword can be text or an E.C. number. The search automatically adds a wildcard character to the beginning and end of your keyword. It does not support Boolean searches. The search is specific to the Candida Biochemical Pathways dataset and will not search the rest of the database at CGD.
- If there is only one match to the keyword, the match will be displayed immediately. If there is more than one match, all the results will be listed.
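The search behavior described above can be approximated with a short sketch. It is an illustration only and not the Pathway Tools implementation: the function names and the example entries are hypothetical, and case-insensitive matching is an assumption. Adding a wildcard to both ends of the keyword is treated here as plain substring matching.

```python
def keyword_search(keyword, names):
    """Return every name that contains the keyword as a substring.

    Substring matching is equivalent to wrapping the keyword in a leading
    and a trailing wildcard. Case-insensitivity is an assumption.
    """
    needle = keyword.strip().lower()
    if not needle:
        return []
    return [name for name in names if needle in name.lower()]


def describe_result(matches):
    """Mimic the display rule: one hit opens directly, several are listed."""
    if not matches:
        return "no matches"
    if len(matches) == 1:
        return f"display page: {matches[0]}"
    return "list results: " + ", ".join(matches)


# Hypothetical entries; these are not taken from the CGD dataset.
names = ["glycolysis", "gluconeogenesis", "TCA cycle", "EC 1.1.1.1"]

print(describe_result(keyword_search("glyco", names)))
print(describe_result(keyword_search("gl", names)))
print(describe_result(keyword_search("1.1.1", names)))
# Boolean operators are not interpreted; the whole string is one keyword.
print(describe_result(keyword_search("glycolysis AND TCA", names)))
```

Because the keyword is matched anywhere in a name, a short keyword such as "gl" returns a list, while a more specific keyword leads straight to a single page.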
Choose from a list of all...
- The pathways, E.C. numbered reactions, and compounds in the Pathway Tools are organized into hierarchies to show the relationship between items in each of these classes. (CGD is not using the Pathway Tools gene hierarchies for curation, so selecting the "Genes" ontology will yield a noninformative result; most of the gene products involved in biochemical pathways will be listed under the categories "unclassified" or "ORFs.")
- Selecting one of these ontologies and clicking "Submit" will display the categories of the ontology. You can browse the ontology by clicking on the name of any category or by using the +/- options to expand or collapse regions of the hierarchy. Each category page displays the parents of that category as well as all of its children.
- The items listed are hyperlinks to the pathway, reaction, or compound page.
Cellular Overview Diagram/Omics viewer
- This option displays an alphabetical list of all the pathways, proteins, or compounds present in the database.
- The items on the list are hyperlinks. If you click on a pathway name, for example, you will be directed to the pathway page.
Links to summary information
- The Overview Diagram displays all the biochemical pathways and reactions that have been created in the CGD dataset.
- All shapes represent metabolic compounds of different classes. A legend is also available on the right side of the overview diagram.
  - Triangle: amino acid
  - Square: carbohydrate and derivatives
  - Diamond: proteins and modified proteins
  - Vertical ellipse: purines
  - Horizontal ellipse: pyrimidines
  - T: tRNA (not curated in Pathway Tools at CGD)
  - Circle: all other compounds
  - Filled-in shape: phosphorylated compound
- Blue lines represent reactions that have been associated with a gene product. Reactions that have not been associated with a gene product are grayed out.
- The gray lines between metabolic compounds indicate where the same compound is present in a different reaction. For the sake of clarity, the gray lines are not exhaustive.
- The TCA cycle is shown near the center, with catabolic pathways on the right hand side and biosynthetic pathways on the left hand side. The flow of the pathway is from the top of the page to the bottom. On the far right side, reactions that are not part of currently curated pathways are listed.
- If you click on a compound, the pathway containing that compound will be displayed.
- The Omics Viewer is a feature of the Pathway Tools that allows you to superimpose expression data onto the metabolic Overview Diagram. With this feature, you are able to import a data file and display the absolute expression levels of metabolic enzymes, compare the expression of metabolic enzymes under different conditions, or create an animation of how expression levels change over time. The Pathway Tools provides a help document that includes information about file formats, analysis options, and interpreting results.
- The PathoLogic Pathway Analysis page contains links to two web pages, the Pathway Report and the Pathway Holes Report. The Pathway Report provides a visual overview of the basis for prediction of each pathway in the current dataset. The Pathway Holes Report lists all reactions that lie within predicted pathways and that were not associated with a gene product at the time the pathway predictions were made. The report lists reactions that were subsequently assigned to a gene product by the "Pathway Hole Filler" software within Pathway Tools, and it separately lists those reactions that were not assigned to any candidate gene and therefore remain associated with no gene product.
- Because CGD only contains pathway data for Candida, no inter-species comparisons are available. However, this tool supports generation of various tables summarizing data from the pathway dataset. Please note that the pathway dataset is not completely representative of the genome itself, not all types of data that are presented in these tables have been curated in the dataset, and some types of data are not applicable.
- Please also note that there are buttons labeled "Cross-species Comparison" (found on some of the pathway-related web pages), but that these functions are not operational because CGD only contains a single pathway dataset.
The Advanced Query form allows you to retrieve specific types of information from the Candida Biochemical Pathways. The data structure of the Candida Biochemical Pathways is shared with EcoCyc and other datasets created using the Pathway Tools. The Pathway Tools web site provides a detailed help document for the Advanced Query.
Many items on the pathway diagrams are hyperlinks to detail pages; these links are underlined or they appear as colored text.
- Each pathway is shown on a single page.
- There are five levels of detail at which each pathway may be viewed. Use the "More detail" and "Less detail" buttons at the top of the page to browse the pathway detail. If the pathway contains many reactions, only intermediate compounds and reaction arrows are displayed initially. If the pathway contains only a few reactions, more detailed information, such as gene name and enzyme name, is shown by default.
- At the most detailed level of view, the compound structure for each intermediate in the pathway, the compound structure for all cofactors in the reaction, the gene name, and the enzyme name are all displayed on the pathway diagram.
- Below the pathway diagram, several fields provide additional context for the pathway. The "Superclasses" field lists the pathway ontology assignments for the pathway. Any sub-pathways that make up the current pathway, or any pathway that contains the current pathway within it, are listed under "Subpathways" or "Superpathways," respectively.
- Any synonyms for the pathway are listed.
- The "Summary" contains additional, curated information about the pathway, with references. CGD curators are in the process of writing summaries for each pathway. Pathways in CGD that correspond to pathways in SGD may also display the SGD pathway summary. CGD thanks SGD for their permission to include this information on our pathway displays.
- The "Unification Links" section displays links to corresponding pathway(s) in other databases, SGD in particular.
- "If an enzyme name is shown in bold, there is experimental evidence for this enzymatic activity": Please note that CGD is not routinely using this feature. The evidence supporting the assignment of each enzyme activity is best assessed by visiting the CGD Locus page to peruse the Gene Ontology assignments, which each include a code that specifies the type of evidence used to make the annotation, and the reference from which the evidence was curated. Enzyme assignments made by the Pathway Hole Filler software are denoted upon mouseover of the gold enzyme name on the pathway diagram, by a note that says "Inferred computationally without human oversight [Green 04]."
- The "Pathway Evidence Glyph" displays an icon representing the current pathway, which is color-coded to represent the evidence used by the Pathway Tools software to predict that the pathway occurs in Candida albicans.
- At the bottom of the page, the reference list is displayed. Each of the entries is a hyperlink to the corresponding PubMed record.
- Each reaction is displayed on its own page, with the E.C. number of the reaction at the top.
- The reaction page lists details about the chemical reaction, including the following:
  - E.C. category.
  - Name of the enzyme that catalyzes the reaction.
  - Gene name.
  - Pathways that contain this reaction.
  - The compounds and cofactors involved in the reaction, with their chemical structures.
- The Unification Links section provides a hyperlink to the corresponding entry on the ExPASy website.
- Gene-reaction schematic:
  - The gene-reaction schematic is a visual representation of the relationship among a set of genes, enzymes, and reactions.
  - The blue boxes on the left-hand side represent the reactions, the purple boxes on the right-hand side represent genes, and the gold circles in the middle represent polypeptides or protein complexes. Depending on which page you are viewing, the appropriate box or circle will be filled in. For example, when viewing the schematic from a reaction page, the reaction box is highlighted.
  - Each of the boxes and circles is a hyperlink to the corresponding reaction page, enzyme page, or CGD Locus page.
  - The lines represent the relationship between these objects. A line from the gene to a circle indicates that the gene codes for that polypeptide. A line from a circle to another circle indicates that the gene product is a subunit of that complex. A line from the circle to the reaction indicates that the reaction is catalyzed by that polypeptide or protein complex.
- The enzyme page lists the name of the enzyme, any synonyms or alternative enzyme names, the gene that codes for the polypeptide, the gene-reaction schematic (described above, under Reactions), the reaction(s) catalyzed by the enzyme, and the pathways that contain those reactions.
- If the enzyme catalyzes multiple reactions, they will all be listed.
- The compound page lists the common name of the chemical, any synonyms, the empirical formula, the molecular weight, the structure (if available), and the SMILES string for the chemical compound. (SMILES is a line notation that describes a chemical structure as a text string.)
- The Unification Links section provides hyperlinks to corresponding entries for the compound in other databases, including ChEBI, Ligand (at KEGG), and PubChem.
- Also listed are reactions in which the chemical compound is a reactant or a product, and the pathways that contain these reactions.
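The relationships drawn in the gene-reaction schematic (gene to polypeptide, polypeptide to complex, polypeptide or complex to reaction) can be modeled as a small graph. The sketch below is only an illustration of that structure: the `Protein` class, the `reactions_for_gene` function, and all gene, protein, and reaction names are hypothetical and do not reflect the Pathway Tools data model.

```python
from dataclasses import dataclass, field


@dataclass
class Protein:
    """A polypeptide or a protein complex (a gold circle in the schematic)."""
    name: str
    encoded_by: list = field(default_factory=list)  # gene names (gene-to-circle lines)
    subunit_of: list = field(default_factory=list)  # Protein objects (circle-to-circle lines)
    catalyzes: list = field(default_factory=list)   # reaction ids (circle-to-reaction lines)


def reactions_for_gene(gene, proteins):
    """Follow the lines from a gene through its products to their reactions."""
    found = set()
    seen = set()
    stack = [p for p in proteins if gene in p.encoded_by]
    while stack:
        protein = stack.pop()
        if id(protein) in seen:  # guard against visiting a complex twice
            continue
        seen.add(id(protein))
        found.update(protein.catalyzes)
        stack.extend(protein.subunit_of)
    return sorted(found)


# Hypothetical example: two genes encode subunits of one complex, and one
# subunit also catalyzes a reaction on its own.
complex_ab = Protein("complex AB", catalyzes=["rxn1"])
subunit_a = Protein("subunit A", encoded_by=["geneA"], subunit_of=[complex_ab])
subunit_b = Protein("subunit B", encoded_by=["geneB"],
                    subunit_of=[complex_ab], catalyzes=["rxn2"])
proteins = [complex_ab, subunit_a, subunit_b]

print(reactions_for_gene("geneA", proteins))
print(reactions_for_gene("geneB", proteins))
```

Reading the schematic from a gene box rightward to leftward corresponds to this traversal: each line followed adds either a complex to visit or a reaction to report.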
Because the Candida Biochemical Pathways dataset contains only genes that are involved in metabolic pathways, only these genes are taken into consideration when data are displayed. In addition, if a metabolic enzyme has not been associated with a reaction, its results will not be shown on the Overview Diagram. As a consequence, the statistics produced by the Pathway Tools may be skewed.