Gene naming guidelines
Gene naming will typically be left to the researchers who study the
genes, and annotators will generally not spend time deciding on gene
names. For sets of genes that will likely not be characterized again
in Candida (e.g., metabolic enzymes already well characterized in
Candida), a set of "suggested" gene names based on homology to
Candida genes will be listed on the web site. In general, publication
will be the criterion for making a common gene name the standard name
for an ORF.
Annotation tools
Matt Berriman gave a demo of the Artemis annotation tool and a brief
introduction to GO.
The latest assembly (Super Assembly 19) will be released in the next week or so by Stew, Ted, and the rest of the sequencing group.
Christophe d'Enfert will use the new assembly to make a second list of ORFs, expanding the ORF set to include ORFs between 40 and 100 amino acids long. Initially, every ORF longer than 40 amino acids, measured from stop to stop, will be considered. Gene Mark (codon bias) and perhaps Gene Finder (introns) will then be used to refine the ORF list.
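The stop-to-stop criterion above can be illustrated with a minimal Python sketch. It is not the pipeline the group used. The function name, the six-frame scan, and the rule that a stretch must be bounded by a stop codon on both sides (so open stretches running off a contig edge are skipped) are all assumptions made for illustration.

```python
# Illustrative sketch only: names and edge handling are assumptions,
# not taken from the meeting notes.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def stop_to_stop_orfs(seq, min_codons=40):
    """Yield (strand, frame, start, end) for stop-to-stop open stretches.

    A stretch is reported only if it holds MORE than min_codons codons
    and is bounded by a stop codon on both sides. start and end are
    0-based offsets on the strand being scanned; end is the position of
    the closing stop codon (exclusive end of the open stretch).
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = None  # set once the first stop in this frame is seen
            for pos in range(frame, len(s) - 2, 3):
                if s[pos:pos + 3] in STOP_CODONS:
                    if run_start is not None and (pos - run_start) // 3 > min_codons:
                        yield (strand, frame, run_start, pos)
                    run_start = pos + 3


if __name__ == "__main__":
    # Toy contig: a stop, 41 sense codons, then another stop.
    toy = "TAA" + "GCT" * 41 + "TAG"
    for orf in stop_to_stop_orfs(toy):
        print(orf)
```

A candidate list produced this way would over-call heavily, which is why a refinement step such as the codon-bias and intron analysis mentioned above would still be needed.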
These ORFs will be mapped to Assembly 6 ORFs and compared to the ORFs found by the Stanford group. In addition, information from Filemaker Pro, Candida DB, and other available databases will be merged and incorporated, along with the sequence data, into 270 EMBL-formatted files, one file per contig. These EMBL files can then be read by the Artemis program, which will be used to distribute the work among the annotators.
As a consistency test, all annotators will first independently annotate the same set of roughly 50 ORFs so that their annotations can be compared. Because of the complexity of the task and the time differences, meeting in person is highly desirable when this stage is reached. EMBL files will be deposited, synchronized, and distributed to annotators by CGD.
Because identifying ORFs is very time consuming in itself, very little manual secondary curation (e.g., GO curation by reading papers) will be done. One idea is to do a BLAST of Candida and Candida to transfer GO annotations used in Candida. These will initially have the IEA (inferred from electronic annotation) evidence code. If annotators "approve" an association, which can easily be done in Artemis, they can change the evidence code to ISS (inferred from sequence similarity). For more on GO, see the GO home page.
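The evidence-code workflow described above can be sketched as a small data model. This is only an illustration under stated assumptions: the `GoAssociation` class, the `approve` function, and the placeholder ORF identifier are hypothetical and do not reflect how Artemis or CGD actually store annotations.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GoAssociation:
    """Hypothetical record linking an ORF to a GO term."""
    orf_id: str             # placeholder identifier, not a real CGD ORF name
    go_id: str
    evidence: str = "IEA"   # transferred annotations start as IEA


def approve(assoc):
    """Return a copy promoted from IEA to ISS after annotator review.

    Associations carrying any other evidence code are returned unchanged.
    """
    if assoc.evidence != "IEA":
        return assoc
    return replace(assoc, evidence="ISS")


if __name__ == "__main__":
    transferred = GoAssociation("orfA", "GO:0008150")
    print(transferred.evidence)           # evidence code before review
    print(approve(transferred).evidence)  # evidence code after approval
```

Keeping the record immutable and returning a new copy on approval leaves the original electronic annotation available for comparison, which may be convenient when several annotators review the same ORFs.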