The collaborative group that generated Assembly 20 has discovered that the sequence traces used to fill some of the gaps and to determine overlaps between Assembly 19 contigs were derived from strain WO-1, rather than from the reference strain, SC5314. The sequence of these regions is consequently expected to be inaccurate wherever WO-1 sequence has been used, and it is also possible that small contigs have been misassembled based on the WO-1 sequence data.
The Biotechnology Research Institute of the National Research Council of Canada released a list of the affected regions. By comparing the 1-kb flanking sequences of each suspect region against the Contig19 sequences, CGD was able to reduce the size of many of the suspect regions. In CGD, these regions are displayed in the Assembly 20 Genome Browser (GBrowse) as orange-colored regions labeled "Suspect_WO1" followed by a number assigned sequentially to the problematic gap regions, 193 in total. (Please see example link.) These changes are reflected in the downloadable GFF files as well as in the Genome Browser display.
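For readers who want to pull the suspect regions out of a downloaded GFF file, the following is a minimal sketch rather than CGD's own tooling. It assumes a tab-delimited GFF with nine columns and assumes that the "Suspect_WO1" label appears in an `ID` or `Name` attribute; the chromosome name, feature types, and identifiers in the example data are invented for illustration and should be checked against the actual file.

```python
def suspect_regions(gff_text, prefix="Suspect_WO1"):
    """Return (seqid, start, end, label) for features whose ID or Name starts with prefix.

    Assumption: the label is stored in an ID= or Name= attribute in column 9.
    """
    regions = []
    for line in gff_text.splitlines():
        # Skip blank lines and GFF comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue
        seqid, start, end, attrs = cols[0], int(cols[3]), int(cols[4]), cols[8]
        for field in attrs.split(";"):
            key, _, value = field.strip().partition("=")
            if key in ("ID", "Name") and value.startswith(prefix):
                regions.append((seqid, start, end, value))
                break
    return regions


# Hypothetical example data; real seqids and labels may differ.
example_gff = "\n".join([
    "##gff-version 3",
    "Ca20chr1\tCGD\tregion\t1000\t2500\t.\t.\t.\tID=Suspect_WO1_1",
    "Ca20chr1\tCGD\tgene\t2000\t3000\t.\t+\t.\tID=orf19.0001",
])

found = suspect_regions(example_gff)
```

Here `found` holds one tuple per matching feature, using the 1-based inclusive coordinates that GFF files carry.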
A list of the reduced regions and their chromosomal locations may be downloaded. A list of the ORFs that are affected by the reduced regions may also be downloaded.
A list of the original regions and their chromosomal locations may be downloaded. A list of the ORFs that are affected by these regions may also be downloaded.
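If you prefer to check your own set of ORFs against the suspect regions rather than rely on the downloadable lists, a simple interval-overlap test is enough. The sketch below is illustrative only and is not necessarily how CGD compiled its lists; it assumes 1-based inclusive coordinates, and the ORF names, chromosome name, and positions are hypothetical.

```python
def overlapping_orfs(orfs, regions):
    """Return names of ORFs that overlap at least one suspect region.

    orfs:    list of (name, seqid, start, end)
    regions: list of (seqid, start, end)
    Coordinates are assumed to be 1-based and inclusive.
    """
    hits = []
    for name, seqid, start, end in orfs:
        for r_seqid, r_start, r_end in regions:
            # Two inclusive intervals overlap when each starts before the other ends.
            if seqid == r_seqid and start <= r_end and r_start <= end:
                hits.append(name)
                break
    return hits


# Hypothetical example data.
regions = [("Ca20chr1", 1000, 2500)]
orfs = [
    ("orfA", "Ca20chr1", 2000, 3000),  # overlaps the region
    ("orfB", "Ca20chr1", 2501, 4000),  # starts just after the region ends
    ("orfC", "Ca20chr2", 1000, 2500),  # same coordinates, different chromosome
]

affected = overlapping_orfs(orfs, regions)
```

The nested loop is adequate for a few hundred regions; for much larger inputs, sorting by position or using an interval tree would be the usual refinement.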
The physical mapping data are now available from the University of Minnesota, at http://albicansmap.ahc.umn.edu/index.html. The optical mapping data have been made available by P.T. Magee, and are now archived at CGD. The mapping data, which were used to order and orient contigs, originate exclusively from the reference strain SC5314, and may be downloaded.
To ensure that you are working only with sequence from the reference strain SC5314, you may retrieve data from Assembly 19 or Assembly 21 instead of Assembly 20. Please feel free to contact us with any further questions.
Additional detailed information about Assembly 20, including other known issues, is available on the CGD Sequence Documentation web page.