The CGD locus page is divided into three sections: BASIC
INFORMATION, RESOURCES, and ADDITIONAL INFORMATION, as described below.
The Basic Information section lists any names and aliases (synonyms, including allele names) for a
particular locus, and indicates whether a name is standard or
reserved. Additional information about gene names in CGD (including an explanation of the standard name format, the systematic name format, and other gene identifiers used) may be found on the Nomenclature page. The Basic Information section also includes the following additional information:
The ortholog mappings between C. albicans and
C. dubliniensis were provided to CGD by John
Gamble and Matthew Berriman at the Wellcome Trust Sanger
Institute. The C. dubliniensis orthologs are manually curated
positional orthologs based on synteny (rather than on sequence
similarity). Assembly 20 of the C. albicans SC5314 genome
sequence was used for positional ortholog determination.
The ortholog mappings between C. albicans and
S. cerevisiae were generated using the InParanoid
software (version 3.0) developed at the Karolinska Institutet.
To run InParanoid, the haploid complement of C. albicans proteins from
CGD was compared to the latest set of S. cerevisiae proteins from SGD
(as of April 7th, 2008), and the set of C. elegans proteins from the
Sanger Institute, in wormpep 188, was used as an outgroup. Stringent
cutoffs were set: BLOSUM80 (instead of the default BLOSUM62), and an
InParanoid score of 100%. In total, 3453 ortholog mappings met these
criteria. BLAST version 2.2.18 was used by InParanoid.
Note that the ortholog pairings are automatically generated, with no
curator intervention. Thus, there will occasionally be pairings that
may not occur with a different scoring matrix. In the interest of
automating the process, we do not intend to hand-curate the ortholog
pairs at this time.
Please also note that, in the Assembly 21-based mapping, orthologs are
not computed or displayed for the C. albicans ORFs that were present in
a prior assembly (Assembly 19 or 20) and subsequently deleted from Assembly 21. The
orthologs of these ORFs are present in the archived Assembly 19 or 20-based mapping
files.
For C. albicans proteins that did not have an ortholog that meets
these criteria, we used BLASTp, using the same parameters as were used
by InParanoid (-F "m S" -M BLOSUM80) with an expectation value (E)
of 1e-5 to identify their best hit in the S. cerevisiae protein
complement. Best hits were identified for 1373 of the C. albicans
proteins.
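The best-hit selection described above can be illustrated with a minimal sketch. It is not the CGD pipeline: it assumes the BLASTp results were saved in the standard 12-column tab-delimited tabular format (E-value in column 11, bit score in column 12), and that "best hit" means the highest bit score among hits at or below the 1e-5 cutoff. The function name and the sample identifiers are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # expectation value cutoff stated in the text


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue, bit_score)} for the best hit per query.

    Assumes 12-column tab-delimited BLAST output; "best" is taken to mean
    the highest bit score among hits passing the cutoff (an assumption).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bits = float(row[10]), float(row[11])
        if evalue > cutoff:
            continue  # hit is weaker than the cutoff
        current = best.get(query)
        if current is None or bits > current[2]:
            best[query] = (subject, evalue, bits)
    return best


# Hypothetical sample rows (identifiers are placeholders, not real ORF names).
rows = [
    ("query1", "subjA", "45.0", "200", "110", "0", "1", "200", "1", "200", "1e-50", "180"),
    ("query1", "subjB", "60.0", "210", "84", "0", "1", "210", "1", "210", "1e-80", "250"),
    ("query2", "subjC", "30.0", "50", "35", "0", "1", "50", "1", "50", "2e-3", "35.0"),
]
SAMPLE = "\n".join("\t".join(r) for r in rows)
HITS = best_hits(SAMPLE)
```

In this sample, query1 keeps only its higher-scoring hit, and query2 has no best hit because its only alignment falls above the E-value cutoff.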
Orthologs and Best Hits are displayed on the CGD Locus pages, and
linked to the corresponding SGD Locus pages. In addition, all of this
information is available in bulk in files from the /Assem21orthologs/
download directory on the CGD web site. The archived Assembly
19-based ortholog mapping (from August 15, 2005) and the Assembly
20-based ortholog mapping (from November 26, 2006) are also available for download from our web site.
Basic Information
The Description field lists general information about the
gene.
This section provides orthologs and best hit mappings of
C. albicans genes to C. dubliniensis and S. cerevisiae.
The GO annotations describe a gene's molecular functions, its role
in biological processes, and its presence in cellular components or
complexes. Each annotation links to a page showing all genes
annotated to the term in CGD. The GO annotations use a controlled vocabulary
that allows powerful searches within CGD and across other
databases. The "Gene Ontology" tab and the "GO evidence and references" link will display a page with more information about the GO assignments, including links to the references that support assignment of each GO term.
The Phenotype field lists the mutant
phenotype for the gene. The "Phenotype" tab and "Phenotype details
and references" links display a page with additional information about the phenotypes, including references and additional notes.
The Pathways field lists any biochemical
pathways in which the gene product is predicted to participate. Each
pathway name is hyperlinked to the corresponding pathway diagram.
The Chromosomal Location field
indicates the location of the gene in Assembly 21 and Assembly 20 of
the C. albicans genome. The coordinates of each exon and
intron are listed, with the relative position of these regions with
respect to the feature, and to the chromosome. Also displayed are the dates on which the
most recent updates to the sequence coordinates and to the sequence
itself were made. Dates that appear on many pages:
Please note: Gaps
have been introduced into some ORFs to make adjustments to the reading
frame or to eliminate stop codons, in cases in which the annotator
judged that the sequence was likely to be in error. These gaps are called
"Adjustments," rather than "Introns," on the Locus page of the affected
ORFs in Assemblies 20 and 21. (All introns and adjustments are called
"Gaps" in Assembly 19.) Some of the gaps have a length that is a
negative number; that is, the coding sequence comprises two
overlapping segments, such that some sequence is counted twice. The
"adjustments" should be considered flags that indicate that
resequencing of the area is advised.
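A small worked example may help show how a gap can have a negative length. The sketch below assumes segments are given as 1-based, inclusive (start, end) pairs listed in order along the forward strand; this coordinate convention and the function names are illustrative assumptions, not a description of how CGD computes the values.

```python
def gap_lengths(segments):
    """Length of the gap between each pair of consecutive segments.

    A negative value means the two segments overlap, so the shared
    bases are counted once in each segment.
    """
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(segments, segments[1:])]


def coding_length(segments):
    """Total number of bases across all segments (overlaps counted twice)."""
    return sum(end - start + 1 for start, end in segments)


# An ordinary gap: 50 bases (101..150) lie between the two segments.
ordinary = [(1, 100), (151, 300)]

# An overlapping "adjustment": bases 96..100 belong to both segments.
overlapping = [(1, 100), (96, 300)]

ORDINARY_GAPS = gap_lengths(ordinary)        # [50]
OVERLAP_GAPS = gap_lengths(overlapping)      # [-5]
OVERLAP_CODING = coding_length(overlapping)  # 305, although the feature spans only 300 bases
```

In the overlapping case the coding sequence is 5 bases longer than the region it spans, which is the double counting that the text describes.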
The Contig Location field lists the name of the contig (in Contig19 nomenclature, from Assembly 19 of the genomic sequence) and the base pair coordinates within the contig at which each allele resides. The name of each allele is a hyperlink to the GBrowse genome browser, which allows viewing and navigation of the contig.
This field provides links to other information sources for the gene.
This is an identifier that has been
assigned to the locus at CGD.
Resources
Allows you to view and navigate the chromosomes from Assembly 21 (or Assembly 20) of the genome sequence, and the contigs from Assembly 19. GBrowse may be accessed using the GBrowse map thumbnail views or the "Contig Location(s)" links. See the ACT1 Locus Page as an example. The three small maps on the right-hand side of the page correspond to the ACT1 ORF in Assembly 21 and, below, the two ACT1 alleles from Assembly 19. (A hyperlink to Assembly 20 GBrowse is also available; however, no map thumbnail view is displayed for Assembly 20.) Clicking on one of these will display a region of the chromosome or contig, with the selected ORF at the center and neighboring ORFs visible on either side. GBrowse is described in more detail on the GBrowse Help Documentation page.
Retrieves, for the gene in Assembly 21 (or 20), or for each allele in
Assembly 19, the Genomic DNA (with introns), the Exons-only Sequence
(Coding Sequence of an ORF or, for any non-translated feature, the
sequence with introns removed), the Genomic DNA with 1 kb of flanking sequence upstream and downstream of the gene (also includes any introns), or the ORF
translation. For more details about the sequence datasets, please see the Sequence Documentation page.
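The relationship between these sequence options can be sketched as follows. This is an illustration only, not the retrieval tool itself: it assumes a forward-strand feature, 1-based inclusive coordinates, and flanking sequence that is truncated at the ends of the chromosome or contig. The function names and the toy sequence are hypothetical; a reverse-strand feature would additionally need to be reverse-complemented.

```python
FLANK = 1000  # 1 kb of flanking sequence, as described in the text


def genomic_with_flanks(chromosome, start, end, flank=FLANK):
    """Genomic DNA of a feature plus `flank` bases on each side.

    Coordinates are 1-based and inclusive; the flanks are clamped so
    they never run past either end of the sequence (an assumption).
    """
    left = max(start - flank, 1)
    right = min(end + flank, len(chromosome))
    return chromosome[left - 1:right]


def exons_only(chromosome, exons):
    """Concatenate the exon segments, leaving out everything between them."""
    return "".join(chromosome[s - 1:e] for s, e in exons)


# Toy sequence: a 9-base feature at positions 11..19 with 10 bases on each side.
toy = "A" * 10 + "CCCGGGTTT" + "A" * 10

WITH_FLANKS = genomic_with_flanks(toy, 11, 19, flank=3)  # 3 bases on each side
CLAMPED = genomic_with_flanks(toy, 11, 19)                # 1 kb requested, clamped to the toy sequence
SPLICED = exons_only(toy, [(11, 13), (17, 19)])           # positions 14..16 act as the intron
```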
Allows comparison of the genomic, coding, or ORF translation sequence
to various C. albicans sequence datasets. For more detailed
instructions for use of the BLAST tool, please see the BLAST Search Help page. For a detailed
description of the search output, please see the BLAST Results Help page.
Recommends primers appropriate for either PCR or sequencing of a
given sequence, within configurable parameters. For more
information about this tool, please see the Web Primer
tool help page.
Generates a restriction map of a specified DNA sequence. The
restriction map may include all enzymes, or a subset of enzyme
types (3' overhangs, 5' overhangs, blunt ends, or enzymes that cut
once or twice). For more
information about this tool, please see the Restriction Analysis tool help page.
The Flanking Features Table
provides a tabular view of the other nearby features (ORFs, etc.) on
the Assembly 21 chromosome, with brief descriptions of each
neighboring feature and links to additional resources and
information. By default, the table lists features within 5 kb flanking
the locus; use the "Retrieve Features Left" or "Retrieve Features
Right" options near the bottom of the page to include additional
flanking sequence. The column labeled "Retrieve" provides links to the
Locus page and to the Assembly 21 nucleotide sequence of each feature
listed in the table.
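The default 5 kb window can be illustrated with a short sketch. It assumes that a feature is listed when any part of it falls within 5 kb of the locus; whether the actual table treats partially overlapping features this way is an assumption, and the `Feature` type and the feature names are hypothetical.

```python
from typing import NamedTuple

WINDOW = 5000  # default of 5 kb on each side, as described in the text


class Feature(NamedTuple):
    name: str
    start: int  # 1-based chromosomal coordinates, start <= end
    end: int


def flanking_features(locus, features, window=WINDOW):
    """Features that overlap the region within `window` bases of the locus."""
    lo, hi = locus.start - window, locus.end + window
    return [
        f for f in features
        if f.name != locus.name and f.end >= lo and f.start <= hi
    ]


locus = Feature("locusX", 10000, 12000)  # window covers 5000..17000
neighbors = [
    Feature("geneA", 3000, 4500),    # ends before the window: excluded
    Feature("geneB", 4000, 6000),    # overlaps the left edge: included
    Feature("geneC", 16500, 18000),  # overlaps the right edge: included
    Feature("geneD", 17001, 19000),  # starts after the window: excluded
]
NEARBY = [f.name for f in flanking_features(locus, neighbors)]
```

Passing a larger `window` value plays the same role as the "Retrieve Features Left" and "Retrieve Features Right" options, except that this sketch widens both sides at once.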
Documentation
Additional Information