Sequencing the USDA core soybean collection reveals gene loss during domestication and breeding

Philipp E. Bayer, Babu Valliyodan, Haifei Hu, Jacob I. Marsh, Yuxuan Yuan, Tri D. Vuong, Gunvant Patil, Qijian Song, Jacqueline Batley, Rajeev K. Varshney, Hon Ming Lam, David Edwards, Henry T. Nguyen

Research output: Contribution to journal › Article › peer-review

14 Citations (Web of Science)


The gene content of plants varies between individuals of the same species due to gene presence/absence variation, and selection can alter the frequency of specific genes in a population. Selection during domestication and breeding will modify the genomic landscape, though the nature of these modifications is only understood for specific genes or on a more general level (e.g., by a loss of genetic diversity). Here we have assembled and analyzed a soybean (Glycine spp.) pangenome representing more than 1,000 soybean accessions derived from the USDA Soybean Germplasm Collection, including both wild and cultivated lineages, to assess genomewide changes in gene and allele frequency during domestication and breeding. We identified 3,765 genes that are absent from the Lee reference genome assembly and assessed the presence/absence of all genes across this population. In addition to a loss of genetic diversity, we found a significant reduction in the average number of protein-coding genes per individual during domestication and subsequent breeding, though with some genes and allelic variants increasing in frequency associated with selection for agronomic traits. This analysis provides a genomic perspective of domestication and breeding in this important oilseed crop.

Original language: English
Article number: e20109
Journal: Plant Genome
Issue number: 1
Early online date: 24 Jun 2021
Publication status: Published - Mar 2022

