Index of /download/gff/C_albicans_SC5314
Name Last modified Size Description
Parent Directory -
5prime_utr_intron_A20.gff 17-May-2007 09:37 6.1K
5prime_utr_intron_A21.gff 26-Aug-2010 13:16 6.6K
A19_ForcheSNPs.gff 31-Aug-2007 13:56 82K
A20_ForcheSNPs.gff 31-Aug-2007 13:56 80K
A21_ForcheSNPs.gff 26-Aug-2010 13:16 90K
Assem19mapping.gff 12-Apr-2007 12:23 1.8M
Assem20mapping.gff 13-Aug-2008 10:16 5.9M
Assem21mapping.gff 26-Aug-2010 13:17 7.5M
C_albicans_SC5314_version_A19-s01-m03-r01_features.gff 03-Oct-2011 14:06 4.0M
C_albicans_SC5314_version_A19-s01-m03-r01_features_with_chromosome_sequences.gff.gz 03-Oct-2011 14:08 8.6M GZIP compressed docume>
C_albicans_SC5314_version_A21-s02-m01-r18_features.gff 29-Jan-2012 07:04 3.2M
C_albicans_SC5314_version_A21-s02-m01-r18_features_with_chromosome_sequences.gff.gz 29-Jan-2012 07:05 4.7M GZIP compressed docume>
C_albicans_SC5314_version_A21-s02-m01-r18_intergenic.gff 29-Jan-2012 08:12 1.3M
Unannotated_transcripts_Bruno_et_al.gff 11-Oct-2010 14:29 132K
Unannotated_transcripts_Sellam_et_al.gff 17-Sep-2010 14:18 380K
Unannotated_transcripts_Tuch_et_al_2010.gff 11-Oct-2010 14:29 270K
archive/ 29-Jan-2012 08:12 -
candida_20.gff 06-Oct-2008 21:10 2.0M
suspect_WO1_regions_reduced.gff 05-Jun-2007 13:02 42K
This directory contains downloadable GFF files for the genome of C. albicans SC5314.
The features described in these files include chromosomes, Contig19s, ORFs, tRNAs,
centromeres, sequence gaps, etc.
Please see http://song.sourceforge.net/gff3.shtml for a detailed description of the Generic
Feature Format (GFF).
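As an illustration of the format, the following is a minimal Python sketch of reading one GFF3 feature line. It assumes the standard nine tab-delimited columns and key=value attributes separated by semicolons; the names GffFeature and parse_gff3_line are hypothetical, and the sample values in the usage note are made up rather than taken from any CGD file.

```python
from typing import NamedTuple
from urllib.parse import unquote


class GffFeature(NamedTuple):
    """One GFF3 feature line (hypothetical container, not a CGD API)."""
    seqid: str
    source: str
    type: str
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    score: str
    strand: str
    phase: str
    attributes: dict


def parse_gff3_line(line):
    """Parse one tab-delimited GFF3 line; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    attrs = {}
    for pair in cols[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attribute values
            attrs[unquote(key.strip())] = unquote(value.strip())
    return GffFeature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                      cols[5], cols[6], cols[7], attrs)
```

For example, parse_gff3_line("chrX\tExample\tgene\t100\t250\t.\t+\t.\tID=gene0001") returns a feature whose start is 100 and whose attributes contain ID.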
The file C_albicans_SC5314_A21_features.gff contains the current CGD annotation
based on Assembly 21 of the C. albicans SC5314 genome sequence. It is updated weekly.
The file candida_21_with_chromosome_sequences.gff.gz contains the current CGD annotation
and the current genomic sequence of all chromosomes based on Assembly 21 of the
C. albicans SC5314 genome sequence. The annotations in this file and
C_albicans_SC5314_A21_features.gff above are the same. The chromosome sequences are specified
in the "##FASTA" section at the end of this file according to GFF3 file format
specifications. This file is updated weekly.
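Because the chromosome sequences follow a "##FASTA" line, a reader has to switch from feature lines to sequence records at that marker. The sketch below shows one way to do this in Python, assuming a well-formed GFF3 text stream; the function names split_gff3_with_fasta and read_gzipped_gff3 are hypothetical, and the path passed to the second one is whatever file you have downloaded.

```python
import gzip


def split_gff3_with_fasta(handle):
    """Return (feature_lines, sequences) from a GFF3 text stream.

    feature_lines: non-comment lines that appear before the ##FASTA marker.
    sequences: dict mapping each FASTA record ID to its sequence string.
    """
    feature_lines = []
    chunks = {}
    in_fasta = False
    current_id = None
    for raw in handle:
        line = raw.rstrip("\n")
        if not in_fasta:
            if line.startswith("##FASTA"):
                in_fasta = True
            elif line and not line.startswith("#"):
                feature_lines.append(line)
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks[current_id] = []
        elif current_id is not None and line:
            chunks[current_id].append(line.strip())
    sequences = {seq_id: "".join(parts) for seq_id, parts in chunks.items()}
    return feature_lines, sequences


def read_gzipped_gff3(path):
    """Open a .gff.gz file in text mode and split it as above."""
    with gzip.open(path, "rt") as handle:
        return split_gff3_with_fasta(handle)
```

Holding every chromosome in memory is acceptable for a genome of this size; for much larger files a streaming design would be preferable.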
The file C_albicans_SC5314_A21_intergenic.gff lists the intergenic regions between coding
regions in Assembly 21. The file also contains the length of each intergenic sequence
and its percent GC and AT content. It is updated weekly.
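The kind of values reported in the intergenic file can be illustrated with a short Python sketch. It is not CGD's procedure: it assumes 1-based inclusive coordinates, treats any stretch not covered by a feature as intergenic, and divides by the full sequence length (including any N bases) when computing percentages. The function names are hypothetical.

```python
def intergenic_intervals(features, chrom_length):
    """Return the gaps between features on one chromosome.

    features: iterable of (start, end) pairs, 1-based and inclusive.
    Overlapping features are merged implicitly by tracking a cursor.
    """
    gaps = []
    cursor = 1  # first position not yet covered by a feature
    for start, end in sorted(features):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


def base_composition(seq):
    """Return length, percent GC and percent AT for a sequence string."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        return {"length": 0, "gc_percent": 0.0, "at_percent": 0.0}
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    # Assumption: N and other ambiguity codes count toward the denominator
    return {"length": length,
            "gc_percent": 100.0 * gc / length,
            "at_percent": 100.0 * at / length}
```

Given a chromosome string chrom, the sequence of a gap (start, end) is chrom[start - 1:end], which can be passed to base_composition.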
The file
candida_20.gff
contains the CGD annotation based on Assembly 20 of the
C. albicans genome sequence. It has NOT been updated since Oct 6 2008.
The file
suspect_WO1_regions_reduced.gff
contains all of the regions that were flagged by the BRI as potentially derived from strain
WO-1, rather than the reference strain, SC5314. CGD compared the 1 kb flanking parts of each
suspect region to Contig19 sequences and iteratively shortened the flanking region from the gap
end by 100 bp (10%) until either the remainder had 100% identity with the contig or no
100% identical match could be found. Thus, the section of the
flanking region which aligns perfectly with the contig has been removed from the suspect
list. These are the regions that appear with the label "Suspect WO1" in the Genome Browser on
the CGD web site. Please see http://www.candidagenome.org/help/Assembly20_Advisory.shtml for
additional details.
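The iterative reduction described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not CGD's implementation: exact substring search stands in for the alignment step, the caller states which end of the flank faces the gap, and the function name trim_flank_until_exact_match is hypothetical.

```python
def trim_flank_until_exact_match(flank, contig, gap_end="left", step=100):
    """Shorten `flank` from its gap end until the remainder occurs in `contig`.

    flank:   flanking sequence (about 1 kb in the description above).
    contig:  sequence to compare against.
    gap_end: "left" or "right"; which end of the flank is trimmed (assumption).
    step:    bases removed per iteration (100 bp, i.e. 10% of a 1 kb flank).

    Returns the part of the flank that matches the contig exactly,
    or an empty string if no exact match is found.
    """
    remaining = flank
    while remaining:
        if remaining in contig:  # stand-in for a 100% identity alignment
            return remaining
        if gap_end == "left":
            remaining = remaining[step:]
        else:
            remaining = remaining[:-step]
    return ""
```

The returned segment corresponds to the perfectly aligning section that would be removed from the suspect list; whatever was trimmed away would remain flagged.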
The file
candida_19.gff
contains CGD annotation from Assembly 19 of the C. albicans
genome sequence. This file is a snapshot of the
annotation as of September 2006, immediately before the
data from Assembly 20 was loaded into CGD. This file is
archival; it will not be updated.
The file
Assem19mapping.gff
contains mappings of historic assemblies to Assembly 19 supercontigs. BLAST analysis was
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 19
supercontigs. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.
The file
Assem20mapping.gff
contains mappings of historic assemblies to Assembly 20 chromosomes. BLAST analysis was
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 20
chromosomes. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.
The file
Assem21mapping.gff
contains mappings of historic assemblies to Assembly 21 chromosomes. BLAST analysis was
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 21
chromosomes. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.
The file
A19_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were
mapped to Assembly 19 contigs using the original marker sequences.
The file
A20_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were
mapped to Assembly 20 chromosomes using the original marker sequences.
The file
A21_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were
mapped to Assembly 21 chromosomes using the original marker sequences.
The file
A21_gaps_Nruns.gff
lists stretches of continuous unknown bases, denoted as N, that are 3 or more bases long anywhere on A21 chromosomes.
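A minimal Python sketch of locating such runs is shown below. It assumes 1-based inclusive coordinates as used in GFF; the function names and the source, type and Note values written into the output lines are hypothetical, not those used in the CGD file.

```python
import re


def n_runs(seq, min_len=3):
    """Return (start, end) pairs for runs of N at least `min_len` bases long.

    Coordinates are 1-based and inclusive.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(seq)]


def n_runs_as_gff(seqid, seq, min_len=3):
    """Format each N run as a nine-column, tab-delimited GFF line."""
    lines = []
    for start, end in n_runs(seq, min_len):
        length = end - start + 1
        lines.append("\t".join([
            seqid, "example", "gap",      # source and type are placeholders
            str(start), str(end),
            ".", ".", ".",
            f"Note=N-run of {length} bp",
        ]))
    return lines
```

Runs shorter than min_len are skipped, so a pair of adjacent N bases is not reported.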
The file
5prime_utr_intron_A20.gff or
5prime_utr_intron_A21.gff
contains the 5' UTR intron data published in the paper Mitrovich QM, Tuch BB, Guthrie
C, Johnson AD. Computational and experimental approaches double the number
of known introns in the pathogenic yeast Candida albicans. Genome Res.
2007 Apr;17(4):492-502. These introns are mapped to Assembly 20 or Assembly 21, respectively.
The file
Unannotated_transcripts_Sellam_et_al.gff
contains novel, unannotated transcripts detected in tiling microarray experiments from
Sellam A, Hogues H, Askew C, Tebbji F, van het Hoog M, Lavoie H, Kumamoto CA, Whiteway M, Nantel A.
Experimental annotation of the human pathogen Candida albicans coding and noncoding
transcribed regions using high-resolution tiling arrays. Genome Biol 2010; 11(7):R71.
The file
Unannotated_transcripts_Tuch_et_al_2010.gff
contains novel, unannotated transcriptionally active regions detected by strand-specific
sequencing of RNA from white and opaque cells, described in Tuch BB, Mitrovich QM, Homann OR,
Hernday AD, Monighetti CK, De La Vega FM, Johnson AD (2010) The transcriptomes of two heritable
cell types illuminate the circuit governing their differentiation. PLoS Genet 6(8).
The file
Unannotated_transcripts_Bruno_et_al.gff
contains novel transcriptionally active regions detected in high-throughput sequencing of
cDNA (RNA-seq) under several environmental conditions, described in Bruno VM, Wang Z,
Marjani SL, Euskirchen GM, Martin J, Sherlock G, Snyder M (2010) Comprehensive annotation
of the transcriptome of the human fungal pathogen Candida albicans using RNA-seq.
Genome Res 20(10):1451-8.