Index of /download/GFF

Icon  Name                             Last modified      Size  Description
[DIR] Parent Directory                                       -
[   ] 5prime_utr_intron_A20.gff        17-May-2007 09:37  6.1K
[   ] 5prime_utr_intron_A21.gff        20-Sep-2007 16:08  6.1K
[   ] A19_ForcheSNPs.gff               31-Aug-2007 13:56   82K
[   ] A20_ForcheSNPs.gff               31-Aug-2007 13:56   80K
[   ] A21_ForcheSNPs.gff               31-Aug-2007 09:26   80K
[   ] A21_gaps_Nruns.gff               14-Sep-2007 14:22  1.3K
[   ] Assem19mapping.gff               12-Apr-2007 12:23  1.8M
[   ] Assem20mapping.gff               02-May-2007 18:09  5.6M
[   ] Assem21mapping.gff               13-Sep-2007 16:10  6.5M
[   ] candida_19.gff                   12-May-2008 21:01  3.6M
[   ] candida_20.gff                   12-May-2008 21:10  2.0M
[   ] candida_21.gff                   12-May-2008 21:41  2.0M
[   ] suspect_WO1_regions_reduced.gff  05-Jun-2007 13:02   42K

This directory contains the downloadable CGD GFF files.  These files
describe features in CGD, including chromosomes, Contig19s, ORFs, tRNAs, centromeres, sequence
gaps, etc.

Please see http://song.sourceforge.net/gff3.shtml for a detailed description of the Generic
Feature Format (GFF).
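
As a rough illustration of how these files might be read, here is a minimal Python sketch.
It assumes the files follow the nine-column, tab-delimited GFF3 layout described at the
link above; the names GffFeature and parse_gff3 are hypothetical, and the sample line is
invented for the example rather than taken from any CGD file.

```python
from typing import Dict, Iterable, Iterator, NamedTuple
from urllib.parse import unquote


class GffFeature(NamedTuple):
    """One GFF3 feature line (hypothetical container for this sketch)."""
    seqid: str
    source: str
    type: str
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3(lines: Iterable[str]) -> Iterator[GffFeature]:
    """Yield features from GFF3 text, skipping comments and malformed lines."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break                      # embedded sequence section: stop here
        if not line or line.startswith("#"):
            continue                   # blank line, comment, or directive
        cols = line.split("\t")
        if len(cols) != 9:
            continue                   # not a standard nine-column record
        attrs = {}
        for pair in cols[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attrs[key.strip()] = unquote(value.strip())
        yield GffFeature(cols[0], cols[1], cols[2], int(cols[3]),
                         int(cols[4]), cols[5], cols[6], cols[7], attrs)


# Invented sample input, for illustration only.
sample = [
    "##gff-version 3\n",
    "chrX\tdemo\tgene\t100\t250\t.\t+\t.\tID=gene1;Note=made%20up\n",
]
features = list(parse_gff3(sample))
```

To read a downloaded file, an open file handle can be passed in place of the sample list,
since the function only iterates over lines.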

The file 
candida_21.gff 
contains the current CGD annotation based on Assembly 21 of the 
C. albicans genome sequence.  It is updated weekly.

The file 
candida_20.gff 
contains the current CGD annotation based on Assembly 20 of the 
C. albicans genome sequence.  It is updated weekly.

The file 
suspect_WO1_regions_reduced.gff 
contains all of the regions that were flagged by the BRI as potentially derived from strain
WO-1, rather than the reference strain, SC5314.  CGD compared the 1-kb flanking parts of each
suspect region to Contig19 sequences and iteratively shortened the flanking region from the gap
end in 100-bp (10%) steps until either the remainder showed 100% identity with the contig or
no 100%-identity match could be found.  The section of the flanking region that aligns
perfectly with the contig has thus been removed from the suspect list.  These are the regions
that appear with the label "Suspect WO1" in the Genome Browser on
the CGD web site.  Please see http://www.candidagenome.org/help/Assembly20_Advisory.shtml for
additional details.
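
The iterative trimming described above can be sketched roughly as follows.  This is not
CGD's actual procedure: it substitutes an exact substring search for the alignment step,
assumes the flank is oriented so that the gap end is the end of the string, and assumes the
untrimmed flank is tested first.  The function name trim_flank and the toy sequences are
hypothetical.

```python
def trim_flank(flank: str, contig: str, step: int = 100) -> int:
    """Return the length of the longest trimmed flank found verbatim in contig.

    The flank is shortened from its right-hand (gap) end by `step` bases at a
    time until what remains occurs exactly in `contig`.  Returns 0 if no
    remainder matches.
    """
    length = len(flank)
    while length > 0:
        if flank[:length] in contig:   # stand-in for a 100%-identity alignment
            return length
        length -= step                 # drop another `step` bases at the gap end
    return 0


# Toy sequences with a 2-base step instead of 100 bp, for illustration only.
matched = trim_flank("ACGTACGGTT", "TTACGTACCC", step=2)   # 6
unmatched = trim_flank("GGGG", "AAAA", step=2)             # 0
```

In this sketch, the returned length corresponds to the part of the flank that would be
removed from the suspect list; a result of 0 leaves the whole flank flagged.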


The file 
candida_19.gff 
contains CGD annotation from Assembly 19 of the C. albicans 
genome sequence.  This file is a snapshot of the 
annotation as of September 2006, immediately before the 
data from Assembly 20 was loaded into CGD.  This file is 
archival; it will not be updated. 


The file
Assem19mapping.gff
contains mappings of historic assemblies to Assembly 19 supercontigs. BLAST analysis was 
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 19 
supercontigs. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.


The file
Assem20mapping.gff
contains mappings of historic assemblies to Assembly 20 chromosomes. BLAST analysis was 
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 20 
chromosomes. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.


The file
Assem21mapping.gff
contains mappings of historic assemblies to Assembly 21 chromosomes. BLAST analysis was 
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 21 
chromosomes. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.


The file
A19_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were 
mapped to Assembly 19 contigs using the original marker sequences.


The file
A20_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were 
mapped to Assembly 20 chromosomes using the original marker sequences.


The file
A21_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were 
mapped to Assembly 21 chromosomes using the original marker sequences.


The file
A21_gaps_Nruns.gff
lists every run of 3 or more consecutive unknown bases (denoted N) on the Assembly 21 chromosomes.
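
For illustration, runs of this kind could be located in a sequence with a short regular
expression.  This is a sketch, not the script CGD used; the function name find_n_runs and
the toy sequence are hypothetical, and coordinates are reported 1-based and inclusive, as
in GFF.

```python
import re
from typing import List, Tuple


def find_n_runs(seq: str, min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) for each run of at least `min_len` consecutive N bases.

    Coordinates are 1-based and inclusive.
    """
    pattern = re.compile(r"N{%d,}" % min_len)
    # m.start() is 0-based; m.end() is exclusive, so it already equals
    # the 1-based inclusive end position.
    return [(m.start() + 1, m.end()) for m in pattern.finditer(seq)]


# Toy sequence: the 2-base run at positions 3-4 is too short to be reported.
runs = find_n_runs("ACNNGTNNNACGNNNNN")   # [(7, 9), (13, 17)]
```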


The files
5prime_utr_intron_A20.gff and
5prime_utr_intron_A21.gff
contain the 5' UTR intron data published in Mitrovich QM, Tuch BB, Guthrie 
C, Johnson AD.  Computational and experimental approaches double the number 
of known introns in the pathogenic yeast Candida albicans.  Genome Res. 
2007 Apr;17(4):492-502.  The introns are mapped to Assembly 20 and Assembly 21, respectively.