Abstract
As an increasing number of plant genome sequences become available, it is clear that gene content varies between individuals, and the challenge arises to predict the gene content of a species. However, genome comparison is often confounded by variation in assembly and annotation. Differentiating between true gene absence and variation in assembly or annotation is essential for the accurate identification of conserved and variable genes in a species. Here we present the de novo assembly of the B. napus cultivar Tapidor and its comparison with an improved assembly of the B. napus cultivar Darmor-bzh. Both cultivars were annotated using the same method to allow comparison of gene content. We identified genes unique to each cultivar and differentiated these from artefacts due to variation in the assembly and annotation. We demonstrate that using a common annotation pipeline can result in different gene predictions, even for closely related cultivars, and that repeat regions which collapse during assembly affect whole-genome comparison. After accounting for differences in assembly and annotation, we demonstrate that the genome of Darmor-bzh contains a greater number of genes than the genome of Tapidor. Our results are the first step towards comparison of the true differences between B. napus genomes and highlight the potential sources of error in the future production of a B. napus pangenome. This article is protected by copyright. All rights reserved.
Original language | English |
---|---|
Pages (from-to) | 1-9 |
Number of pages | 9 |
Journal | Plant Biotechnology Journal |
Volume | 15 |
Issue number | 12 |
DOIs | 10.1111/pbi.12742 |
Publication status | E-pub ahead of print - 14 Jun 2017 |
Assembly and comparison of two closely related Brassica napus genomes. / Bayer, Philipp E; Hurgobin, Bhavna; Golicz, Agnieszka A.; Chan, Chon-Kit Kenneth; Yuan, Yuxuan; Lee, Hueytyng; Renton, Michael; Meng, Jinling; Li, Ruiyuan; Long, Yan; Zou, Jun; Bancroft, Ian; Chalhoub, Boulos; King, Graham J; Batley, Jacqueline; Edwards, David.
In: Plant Biotechnology Journal, Vol. 15, No. 12, 14.06.2017, p. 1-9.
Research output: Contribution to journal › Article
TY - JOUR
T1 - Assembly and comparison of two closely related Brassica napus genomes
AU - Bayer, Philipp E
AU - Hurgobin, Bhavna
AU - Golicz, Agnieszka A.
AU - Chan, Chon-Kit Kenneth
AU - Yuan, Yuxuan
AU - Lee, Hueytyng
AU - Renton, Michael
AU - Meng, Jinling
AU - Li, Ruiyuan
AU - Long, Yan
AU - Zou, Jun
AU - Bancroft, Ian
AU - Chalhoub, Boulos
AU - King, Graham J
AU - Batley, Jacqueline
AU - Edwards, David
N1 - This article is protected by copyright. All rights reserved.
PY - 2017/6/14
Y1 - 2017/6/14
KW - Journal Article
U2 - 10.1111/pbi.12742
DO - 10.1111/pbi.12742
M3 - Article
VL - 15
SP - 1
EP - 9
JO - Plant Biotechnology Journal
JF - Plant Biotechnology Journal
SN - 1467-7644
IS - 12
ER -