Index of /download/Assembly19notes
Name Last modified Size Description
Parent Directory -
A19graphs-pdf.tar.gz 2009-02-02 10:32 195K
Contains archived copies of the PDF diagrams showing how each Assembly 19 contig (Contig19) was assembled from Assembly 6 contigs (Contig6). These files were originally made available from the Candida web server at the Stanford Genome Technology Center.
Sequencing of the Candida albicans genome is described in:
Jones, T., Federspiel, N.A., Chibana, H., Dungan, J., Kalman, S., Magee, B.B., Newport, G., Thorstenson, Y.R., Agabian, N., Magee, P.T., Davis, R.W. and S. Scherer. (2004) The Diploid Genome of <i>Candida albicans</i>. PNAS 101:7329-7334.
Documentation from the SGTC's <i>Candida</i> information server, archived here (verbatim) for reference:
"When we were able to detect separation of alleles in assembly 6, we combined the affected assembly 6 contigs into larger diploid contigs in assembly 19. All contigs so formed were assigned numbers starting at 10000; for example, Contig19-10014 is made up from contigs 6-1076, 6-2434, 6-1473, 6-1632, 6-2141, and 6-2001. A diagram is provided in PDF format for Contig19-10014 (and all others) showing how it is formed from assembly 6 contigs. A dotted line separates the assembly 6 contigs assigned to the two alleles. In regions where one allele has a gap, the sequence is presumed to be homozygous and is filled in from the other allele. Otherwise the top allele derives its sequence from the assembly 6 contig shown above the dotted line, and the bottom allele from the contig at the same position shown below the line. This process results in two sequences representing the two alleles for the contig. The top allele is arbitrarily designated as primary, and the sequence given for Contig19-10014 is that derived from the top set of assembly 6 contigs. The sequence for the other allele is given the name Contig19-20014 (i.e., add 10000 to the number of the primary allele). In viewing the diagrams, note that because of insertions and deletions between alleles, corresponding poisitions on the two alleles are not always connected by a direct vertical line, but usually in large diploid contigs the size of insertions is visually negligible."
Additional sequence documentation, including the Assembly 19 and Assembly 6 documentation provided by the Stanford Genome Technology Center, is found on the CGD web site at:
The Assembly 19 Contig Diagram files are distributed as a
gzip-compressed tar archive. Several freely available programs can
decompress gzipped files on Windows. The software and other
useful information are available on these web sites:
- WinZip (http://www.winzip.com/)
- Stuffit (http://www.stuffit.com/)
- Gzip (http://www.gzip.org/)
and the gzip user's manual:
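
As an alternative to the tools listed above, the archive can also be unpacked with Python's standard `tarfile` module. The following is a minimal sketch rather than an official CGD procedure: the function name `extract_diagrams` is hypothetical, and it assumes the diagram files inside the archive end in `.pdf`.

```python
import tarfile
from pathlib import Path


def extract_diagrams(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive and return the extracted PDF paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters; "data"
            # rejects unsafe members such as absolute paths.
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    # Assumption: the diagrams are stored with a .pdf extension.
    return sorted(dest.rglob("*.pdf"))
```

Calling `extract_diagrams("A19graphs-pdf.tar.gz", "A19graphs")` unpacks the archive into an `A19graphs` directory (a name chosen here only for illustration) and returns the paths of the PDF files it finds.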