Index of /download/gff/C_albicans_SC5314

Icon  Name                                                                                 Last modified      Size  Description
[DIR] Parent Directory                                                                                           -
[   ] 5prime_utr_intron_A20.gff                                                            17-May-2007 09:37  6.1K
[   ] 5prime_utr_intron_A21.gff                                                            26-Aug-2010 13:16  6.6K
[   ] A19_ForcheSNPs.gff                                                                   31-Aug-2007 13:56   82K
[   ] A20_ForcheSNPs.gff                                                                   31-Aug-2007 13:56   80K
[   ] A21_ForcheSNPs.gff                                                                   26-Aug-2010 13:16   90K
[   ] Assem19mapping.gff                                                                   12-Apr-2007 12:23  1.8M
[   ] Assem20mapping.gff                                                                   13-Aug-2008 10:16  5.9M
[   ] Assem21mapping.gff                                                                   26-Aug-2010 13:17  7.5M
[TXT] C_albicans_SC5314_version_A19-s01-m03-r01_features.gff                               03-Oct-2011 14:06  4.0M
[   ] C_albicans_SC5314_version_A19-s01-m03-r01_features_with_chromosome_sequences.gff.gz  03-Oct-2011 14:08  8.6M  GZIP compressed docume>
[TXT] C_albicans_SC5314_version_A21-s02-m01-r18_features.gff                               29-Jan-2012 07:04  3.2M
[   ] C_albicans_SC5314_version_A21-s02-m01-r18_features_with_chromosome_sequences.gff.gz  29-Jan-2012 07:05  4.7M  GZIP compressed docume>
[TXT] C_albicans_SC5314_version_A21-s02-m01-r18_intergenic.gff                             29-Jan-2012 08:12  1.3M
[   ] Unannotated_transcripts_Bruno_et_al.gff                                              11-Oct-2010 14:29  132K
[   ] Unannotated_transcripts_Sellam_et_al.gff                                             17-Sep-2010 14:18  380K
[   ] Unannotated_transcripts_Tuch_et_al_2010.gff                                          11-Oct-2010 14:29  270K
[DIR] archive/                                                                             29-Jan-2012 08:12     -
[   ] candida_20.gff                                                                       06-Oct-2008 21:10  2.0M
[   ] suspect_WO1_regions_reduced.gff                                                      05-Jun-2007 13:02   42K

This directory contains downloadable GFF files for the genome of C. albicans SC5314.
The features described in these files include chromosomes, Contig19s, ORFs, tRNAs,
centromeres, sequence gaps, etc.

Please see http://song.sourceforge.net/gff3.shtml for a detailed description of the Generic
Feature Format (GFF).
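
As a rough illustration of the format, a GFF3 feature line has nine tab-separated
columns, the last of which holds semicolon-separated key=value attributes. The
sketch below parses one such line. The function name parse_gff3_line and the
sample line are hypothetical and are not taken from any CGD file.

```python
from urllib.parse import unquote

# The nine tab-separated columns defined by the GFF3 specification.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Parse one GFF3 line into a dict; return None for comments and blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based and inclusive in GFF3.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    if record["attributes"] != ".":
        for pair in record["attributes"].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                # GFF3 uses URL-style percent-encoding for reserved characters.
                attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record


# Hypothetical example line, for illustration only.
example = "chr_demo\texample\tgene\t100\t200\t.\t+\t.\tID=gene0001;Name=demo%20gene"
feature = parse_gff3_line(example)
```

Attribute values that contain multiple comma-separated entries are left as a
single string here; splitting them is a design choice left to the caller.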

The file C_albicans_SC5314_A21_features.gff contains the current CGD annotation
based on Assembly 21 of the C. albicans SC5314 genome sequence. It is updated weekly.

The file candida_21_with_chromosome_sequences.gff.gz contains the current CGD annotation
and the current genomic sequence of all chromosomes based on Assembly 21 of the
C. albicans SC5314 genome sequence. The annotations in this file and
C_albicans_SC5314_A21_features.gff above are the same. The chromosome sequences are specified 
in the "##FASTA" section at the end of this file according to GFF3 file format 
specifications. This file is updated weekly.
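
Because the chromosome sequences follow a "##FASTA" line, a reader has to switch
from feature parsing to FASTA parsing at that marker. The following is a minimal
sketch of that split, assuming a text-mode file handle; the function name
split_gff3 and the demo content are hypothetical.

```python
import gzip
import io


def split_gff3(handle):
    """Split a GFF3 stream into feature lines and a {name: sequence} dict."""
    features = []
    sequences = {}
    in_fasta = False
    name = None
    chunks = []
    for raw in handle:
        line = raw.rstrip("\n")
        if not in_fasta:
            if line.startswith("##FASTA"):
                in_fasta = True          # everything after this is FASTA
            elif line and not line.startswith("#"):
                features.append(line)    # keep feature lines, skip comments
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif line:
            chunks.append(line.strip())
    if name is not None:
        sequences[name] = "".join(chunks)
    return features, sequences


def read_gzipped_gff3(path):
    """Open a .gff.gz file in text mode and split it (path supplied by caller)."""
    with gzip.open(path, "rt") as handle:
        return split_gff3(handle)


# Small in-memory demo with made-up content.
demo = io.StringIO(
    "##gff-version 3\n"
    "chr_demo\texample\tgene\t1\t6\t.\t+\t.\tID=g1\n"
    "##FASTA\n"
    ">chr_demo demo sequence\n"
    "ACGT\n"
    "ACGTNN\n"
)
demo_features, demo_sequences = split_gff3(demo)
```

This holds whole chromosomes in memory, which is usually acceptable for a
genome of this size but is an assumption worth revisiting for larger inputs.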

The file C_albicans_SC5314_A21_intergenic.gff lists the intergenic regions between coding
regions in Assembly 21.  The file also contains the lengths of these intergenic sequences
and their percent GC and AT content. It is updated weekly.
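
To make the quantities in this file concrete, the sketch below derives intergenic
intervals from a list of feature coordinates and computes the length and percent
GC and AT of a sequence. It is an illustration only: the function names are
hypothetical, and the handling of chromosome ends, overlapping features, and
ambiguous bases is an assumption, not a description of how CGD generates the file.

```python
def intergenic_intervals(features, chrom_length):
    """Return gaps between (start, end) features; 1-based, inclusive coordinates."""
    gaps = []
    cursor = 1  # first position not yet covered by any feature
    for start, end in sorted(features):
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)  # tolerate overlapping features
    if cursor <= chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


def base_composition(sequence):
    """Return (length, percent GC, percent AT) for a DNA string.

    Percentages are taken over A, C, G and T only (an assumption);
    ambiguous bases such as N count toward the length but not the percentages.
    """
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    at = sequence.count("A") + sequence.count("T")
    counted = gc + at
    if counted == 0:
        return len(sequence), 0.0, 0.0
    return len(sequence), 100.0 * gc / counted, 100.0 * at / counted


# Made-up coordinates on a hypothetical 30 bp chromosome.
demo_gaps = intergenic_intervals([(5, 10), (8, 12), (20, 25)], 30)
demo_stats = base_composition("AATTGGCC")
```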

The file 
candida_20.gff 
contains the CGD annotation based on Assembly 20 of the 
C. albicans genome sequence.  It has NOT been updated since Oct 6, 2008.

The file 
suspect_WO1_regions_reduced.gff 
contains all of the regions that were flagged by the BRI as potentially derived from strain
WO-1, rather than the reference strain, SC5314.  CGD compared the 1 kb flanking parts of each
suspect region to Contig19 sequences and iteratively reduced the flanking region from the gap
end by 100 bp (10%) until either the remainder had 100% identity with the contig or no 100%
match could be found. Thus, the section of the flanking region that aligns perfectly with
the contig has been removed from the suspect
list. These are the regions that appear with the label "Suspect WO1" in the Genome Browser on
the CGD web site.  Please see http://www.candidagenome.org/help/Assembly20_Advisory.shtml for
additional details.
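
The trimming procedure described above can be sketched roughly as follows. This
is an illustration under stated assumptions, not CGD's actual pipeline: exact
substring matching stands in for whatever alignment method was used, and the
function name, the gap_end parameter, and the demo sequences are all hypothetical.

```python
def perfectly_matching_flank(flank, contigs, gap_end="right", step=100):
    """Trim `flank` from its gap end in `step`-bp increments until the
    remainder matches some contig exactly.

    Returns the matching remainder, or "" if no exact match is found.
    Exact substring search is a simplification of a real alignment.
    """
    remaining = flank
    while remaining:
        if any(remaining in contig for contig in contigs):
            return remaining
        if gap_end == "right":
            remaining = remaining[:-step]   # drop bases nearest the gap
        else:
            remaining = remaining[step:]
    return ""


# Made-up sequences: 200 bp shared with the contig, 300 bp that are not.
shared = "ACGTTGCA" * 25
demo_contigs = ["GG" + shared + "CC"]
demo_flank = shared + "T" * 300
cleared = perfectly_matching_flank(demo_flank, demo_contigs)

# Under this reading, the cleared part leaves the suspect list and the
# trimmed-off part stays flagged.
still_suspect_length = len(demo_flank) - len(cleared)
```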


The file 
candida_19.gff 
contains CGD annotation from Assembly 19 of the C. albicans 
genome sequence.  This file is a snapshot of the 
annotation as of September 2006, immediately before the 
data from Assembly 20 was loaded into CGD.  This file is 
archival; it will not be updated. 


The file
Assem19mapping.gff
contains mappings of historic assemblies to Assembly 19 super contigs. BLAST analysis was 
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 19 
supercontigs. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.


The file
Assem20mapping.gff
contains mappings of historic assemblies to Assembly 20 chromosomes. BLAST analysis was 
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 20 
chromosomes. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.


The file
Assem21mapping.gff
contains mappings of historic assemblies to Assembly 21 chromosomes. BLAST analysis was 
performed to map Contigs and ORF sequences from each of the older assemblies to the Assembly 21 
chromosomes. Please see http://candidagenome.org/download/mapping_historic_assemblies/
for further details on the analysis procedure and separate mapping files for individual assemblies.


The file
A19_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were 
mapped to Assembly 19 contigs using the original marker sequences.


The file
A20_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were 
mapped to Assembly 20 chromosomes using the original marker sequences.


The file
A21_ForcheSNPs.gff
contains all the SNP locations from Forche A, Magee PT, Magee BB, May G. Genome-wide single-nucleotide
polymorphism map for Candida albicans. Eukaryotic Cell. 2004 Jun;3(3):705-14. SNP locations were 
mapped to Assembly 21 chromosomes using the original marker sequences.


The file
A21_gaps_Nruns.gff
lists stretches of contiguous unknown bases (denoted N) that are 3 or more bases long, anywhere on the Assembly 21 chromosomes.
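
As an illustration of what such a listing captures, the sketch below finds runs of
three or more N characters in a sequence with a regular expression and reports
them as 1-based, inclusive coordinates in the GFF style. The function name n_runs
and the demo sequence are hypothetical.

```python
import re


def n_runs(sequence, min_length=3):
    """Return (start, end) pairs for runs of N at least `min_length` long.

    Coordinates are 1-based and inclusive, as in GFF.
    """
    pattern = re.compile("[Nn]{%d,}" % min_length)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


# Made-up sequence with one short run (ignored) and two qualifying runs.
demo_runs = n_runs("ACNNGTNNNACGNNNNN")
```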


The files
5prime_utr_intron_A20.gff and
5prime_utr_intron_A21.gff
contain the 5' UTR intron data published in Mitrovich QM, Tuch BB, Guthrie 
C, Johnson AD.  Computational and experimental approaches double the number 
of known introns in the pathogenic yeast Candida albicans.  Genome Res. 
2007 Apr;17(4):492-502.  These introns are mapped to Assembly 20 and Assembly 21, respectively.


The file
Unannotated_transcripts_Sellam_et_al.gff
contains novel, unannotated transcripts detected in tiling microarray experiments from 
Sellam A, Hogues H, Askew C, Tebbji F, van het Hoog M, Lavoie H, Kumamoto CA, Whiteway M, Nantel A.
Experimental annotation of the human pathogen Candida albicans coding and noncoding
transcribed regions using high-resolution tiling arrays. Genome Biol 2010; 11(7):R71.

The file
Unannotated_transcripts_Tuch_et_al_2010.gff
contains novel, unannotated transcriptionally active regions detected by strand-specific
sequencing of RNA from white and opaque cells, described in Tuch BB, Mitrovich QM, Homann OR,
Hernday AD, Monighetti CK, De La Vega FM, Johnson AD (2010) The transcriptomes of two heritable
cell types illuminate the circuit governing their differentiation. PLoS Genet 6(8).

The file
Unannotated_transcripts_Bruno_et_al_2010.gff
contains novel transcriptionally active regions detected in high-throughput sequencing of 
cDNA (RNA-seq) under several environmental conditions, described in Bruno VM, Wang Z,  
Marjani SL, Euskirchen GM, Martin J, Sherlock G, Snyder M (2010) Comprehensive annotation  
of the transcriptome of the human fungal pathogen Candida albicans using RNA-seq. 
Genome Res 20(10):1451-8.