TY - JOUR
T1 - SNP discovery using a pangenome
T2 - Has the single reference approach become obsolete?
AU - Hurgobin, Bhavna
AU - Edwards, Dave
PY - 2017/3/1
Y1 - 2017/3/1
N2 - Increasing evidence suggests that a single individual is insufficient to capture the genetic diversity within a species due to gene presence absence variation. To understand the extent to which genomic variation occurs in a species, the construction of its pangenome is necessary. The pangenome represents the complete set of genes of a species; it is composed of core genes, which are present in all individuals, and variable genes, which are present only in some individuals. Aside from variations at the gene level, single nucleotide polymorphisms (SNPs) are also an important form of genetic variation. The advent of next-generation sequencing (NGS), coupled with the heritability of SNPs, makes them ideal markers for genetic analysis of human, animal, and microbial data. SNPs have also been extensively used in crop genetics for association mapping, quantitative trait loci (QTL) analysis, analysis of genetic diversity, and phylogenetic analysis. This review focuses on the use of pangenomes for SNP discovery. It highlights the advantages of using a pangenome rather than a single reference for this purpose. This review also demonstrates how extra information not captured in a single reference alone can be used to provide additional support for linking genotypic data to phenotypic data.
AB - Increasing evidence suggests that a single individual is insufficient to capture the genetic diversity within a species due to gene presence absence variation. To understand the extent to which genomic variation occurs in a species, the construction of its pangenome is necessary. The pangenome represents the complete set of genes of a species; it is composed of core genes, which are present in all individuals, and variable genes, which are present only in some individuals. Aside from variations at the gene level, single nucleotide polymorphisms (SNPs) are also an important form of genetic variation. The advent of next-generation sequencing (NGS), coupled with the heritability of SNPs, makes them ideal markers for genetic analysis of human, animal, and microbial data. SNPs have also been extensively used in crop genetics for association mapping, quantitative trait loci (QTL) analysis, analysis of genetic diversity, and phylogenetic analysis. This review focuses on the use of pangenomes for SNP discovery. It highlights the advantages of using a pangenome rather than a single reference for this purpose. This review also demonstrates how extra information not captured in a single reference alone can be used to provide additional support for linking genotypic data to phenotypic data.
KW - Assembly
KW - Copy number variation
KW - Core genome
KW - Gene
KW - Genetic diversity
KW - Pangenome
KW - Presence absence variation
KW - Single nucleotide polymorphism
KW - SNP discovery
KW - Variable genome
UR - http://www.scopus.com/inward/record.url?scp=85015316200&partnerID=8YFLogxK
U2 - 10.3390/biology6010021
DO - 10.3390/biology6010021
M3 - Review article
C2 - 28287462
AN - SCOPUS:85015316200
SN - 2079-7737
VL - 6
JO - Biology
JF - Biology
IS - 1
M1 - 21
ER -