An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations

Bernardo J. Clavijo, Luca Venturini, Christian Schudoma, Gonzalo Garcia Accinelli, Gemy Kaithakottil, Jonathan Wright, Philippa Borrill, George Kettleborough, Darren Heavens, Helen Chapman, James Lipscombe, Tom Barker, Fu Hao Lu, Neil McKenzie, Dina Raats, Ricardo H. Ramirez-Gonzalez, Aurore Coince, Ned Peel, Lawrence Percival-Alwyn, Owen Duncan, Josua Trösch, Guotai Yu, Dan M. Bolser, Guy Namaati, Arnaud Kerhornou, Manuel Spannagl, Heidrun Gundlach, Georg Haberer, Robert P. Davey, Christine Fosker, Federica Di Palma, Andrew L. Phillips, A. Harvey Millar, Paul J. Kersey, Cristobal Uauy, Ksenia V. Krasileva, David Swarbreck, Michael W. Bevan, Matthew D. Clark

Research output: Contribution to journal (Article)

122 Citations (Scopus)

Abstract

Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome, with a scaffold N50 of 88.8 kb, and has high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
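For readers unfamiliar with the scaffold N50 figure quoted in the abstract, the following is a minimal sketch of the conventional definition: the length L such that scaffolds of length L or longer together account for at least half of the total assembled bases. The function name `n50` and the toy scaffold lengths are illustrative assumptions; they are not code or data from the paper, and the authors' own tooling may compute the statistic differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases
    # until at least half of the total has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy scaffold lengths in bp (not taken from the paper).
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))
```

Because the statistic depends only on the length distribution, a higher N50 indicates a more contiguous assembly but says nothing by itself about correctness, which is why the abstract reports fidelity to the input data separately.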

Original language: English
Pages (from-to): 885-896
Number of pages: 12
Journal: Genome Research
Volume: 27
Issue number: 5
DOI: https://doi.org/10.1101/gr.217117.116
Publication status: Published - 1 May 2017

Fingerprint

Genetic Translocation
Triticum
Genome
Genes
Genomic Structural Variation
Untranslated RNA
Polyploidy
Bread
Breeding
Complementary DNA
RNA
Technology

Cite this

Clavijo, Bernardo J. ; Venturini, Luca ; Schudoma, Christian ; Accinelli, Gonzalo Garcia ; Kaithakottil, Gemy ; Wright, Jonathan ; Borrill, Philippa ; Kettleborough, George ; Heavens, Darren ; Chapman, Helen ; Lipscombe, James ; Barker, Tom ; Lu, Fu Hao ; McKenzie, Neil ; Raats, Dina ; Ramirez-Gonzalez, Ricardo H. ; Coince, Aurore ; Peel, Ned ; Percival-Alwyn, Lawrence ; Duncan, Owen ; Trösch, Josua ; Yu, Guotai ; Bolser, Dan M. ; Namaati, Guy ; Kerhornou, Arnaud ; Spannagl, Manuel ; Gundlach, Heidrun ; Haberer, Georg ; Davey, Robert P. ; Fosker, Christine ; Di Palma, Federica ; Phillips, Andrew L. ; Millar, A. Harvey ; Kersey, Paul J. ; Uauy, Cristobal ; Krasileva, Ksenia V. ; Swarbreck, David ; Bevan, Michael W. ; Clark, Matthew D. / An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations. In: Genome Research. 2017 ; Vol. 27, No. 5. pp. 885-896.
@article{626272a2a0654a43b0591ed9dbc94806,
title = "An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations",
abstract = "Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78{\%} of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known and identified one novel genome rearrangements. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.",
author = "Clavijo, {Bernardo J.} and Luca Venturini and Christian Schudoma and Accinelli, {Gonzalo Garcia} and Gemy Kaithakottil and Jonathan Wright and Philippa Borrill and George Kettleborough and Darren Heavens and Helen Chapman and James Lipscombe and Tom Barker and Lu, {Fu Hao} and Neil McKenzie and Dina Raats and Ramirez-Gonzalez, {Ricardo H.} and Aurore Coince and Ned Peel and Lawrence Percival-Alwyn and Owen Duncan and Josua Tr{\"o}sch and Guotai Yu and Bolser, {Dan M.} and Guy Namaati and Arnaud Kerhornou and Manuel Spannagl and Heidrun Gundlach and Georg Haberer and Davey, {Robert P.} and Christine Fosker and {Di Palma}, Federica and Phillips, {Andrew L.} and Millar, {A. Harvey} and Kersey, {Paul J.} and Cristobal Uauy and Krasileva, {Ksenia V.} and David Swarbreck and Bevan, {Michael W.} and Clark, {Matthew D.}",
year = "2017",
month = "5",
day = "1",
doi = "10.1101/gr.217117.116",
language = "English",
volume = "27",
pages = "885--896",
journal = "Genome Research",
issn = "1054-9803",
publisher = "Cold Spring Harbor Laboratory Press",
number = "5",

}

Clavijo, BJ, Venturini, L, Schudoma, C, Accinelli, GG, Kaithakottil, G, Wright, J, Borrill, P, Kettleborough, G, Heavens, D, Chapman, H, Lipscombe, J, Barker, T, Lu, FH, McKenzie, N, Raats, D, Ramirez-Gonzalez, RH, Coince, A, Peel, N, Percival-Alwyn, L, Duncan, O, Trösch, J, Yu, G, Bolser, DM, Namaati, G, Kerhornou, A, Spannagl, M, Gundlach, H, Haberer, G, Davey, RP, Fosker, C, Di Palma, F, Phillips, AL, Millar, AH, Kersey, PJ, Uauy, C, Krasileva, KV, Swarbreck, D, Bevan, MW & Clark, MD 2017, 'An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations' Genome Research, vol. 27, no. 5, pp. 885-896. https://doi.org/10.1101/gr.217117.116

An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations. / Clavijo, Bernardo J.; Venturini, Luca; Schudoma, Christian; Accinelli, Gonzalo Garcia; Kaithakottil, Gemy; Wright, Jonathan; Borrill, Philippa; Kettleborough, George; Heavens, Darren; Chapman, Helen; Lipscombe, James; Barker, Tom; Lu, Fu Hao; McKenzie, Neil; Raats, Dina; Ramirez-Gonzalez, Ricardo H.; Coince, Aurore; Peel, Ned; Percival-Alwyn, Lawrence; Duncan, Owen; Trösch, Josua; Yu, Guotai; Bolser, Dan M.; Namaati, Guy; Kerhornou, Arnaud; Spannagl, Manuel; Gundlach, Heidrun; Haberer, Georg; Davey, Robert P.; Fosker, Christine; Di Palma, Federica; Phillips, Andrew L.; Millar, A. Harvey; Kersey, Paul J.; Uauy, Cristobal; Krasileva, Ksenia V.; Swarbreck, David; Bevan, Michael W.; Clark, Matthew D.

In: Genome Research, Vol. 27, No. 5, 01.05.2017, p. 885-896.


TY - JOUR
T1 - An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations
AU - Clavijo, Bernardo J.
AU - Venturini, Luca
AU - Schudoma, Christian
AU - Accinelli, Gonzalo Garcia
AU - Kaithakottil, Gemy
AU - Wright, Jonathan
AU - Borrill, Philippa
AU - Kettleborough, George
AU - Heavens, Darren
AU - Chapman, Helen
AU - Lipscombe, James
AU - Barker, Tom
AU - Lu, Fu Hao
AU - McKenzie, Neil
AU - Raats, Dina
AU - Ramirez-Gonzalez, Ricardo H.
AU - Coince, Aurore
AU - Peel, Ned
AU - Percival-Alwyn, Lawrence
AU - Duncan, Owen
AU - Trösch, Josua
AU - Yu, Guotai
AU - Bolser, Dan M.
AU - Namaati, Guy
AU - Kerhornou, Arnaud
AU - Spannagl, Manuel
AU - Gundlach, Heidrun
AU - Haberer, Georg
AU - Davey, Robert P.
AU - Fosker, Christine
AU - Di Palma, Federica
AU - Phillips, Andrew L.
AU - Millar, A. Harvey
AU - Kersey, Paul J.
AU - Uauy, Cristobal
AU - Krasileva, Ksenia V.
AU - Swarbreck, David
AU - Bevan, Michael W.
AU - Clark, Matthew D.
PY - 2017/5/1
Y1 - 2017/5/1
N2 - Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known and identified one novel genome rearrangements. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
AB - Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known and identified one novel genome rearrangements. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
UR - http://www.scopus.com/inward/record.url?scp=85019139117&partnerID=8YFLogxK
U2 - 10.1101/gr.217117.116
DO - 10.1101/gr.217117.116
M3 - Article
VL - 27
SP - 885
EP - 896
JO - Genome Research
JF - Genome Research
SN - 1054-9803
IS - 5
ER -