TY - JOUR
T1 - Subgenomic RNA identification in SARS-CoV-2 genomic sequencing data
AU - COVID-19 Genomics UK (COG-UK) Consortium
AU - Parker, Matthew D.
AU - Lindsey, Benjamin B.
AU - Leary, Shay
AU - Gaudieri, Silvana
AU - Chopra, Abha
AU - Wyles, Matthew
AU - Angyal, Adrienn
AU - Green, Luke R.
AU - Parsons, Paul
AU - Tucker, Rachel M.
AU - Brown, Rebecca
AU - Groves, Danielle
AU - Johnson, Katie
AU - Carrilero, Laura
AU - Heffer, Joe
AU - Partridge, David G.
AU - Evans, Cariad
AU - Raza, Mohammad
AU - Keeley, Alexander J.
AU - Smith, Nikki
AU - Filipe, Ana Da Silva
AU - Shepherd, James G.
AU - Davis, Chris
AU - Bennett, Sahan
AU - Sreenu, Vattipally B.
AU - Kohl, Alain
AU - Aranday-Cortes, Elihu
AU - Tong, Lily
AU - Nichols, Jenna
AU - Thomson, Emma C.
AU - Wang, Dennis
AU - Mallal, Simon
AU - de Silva, Thushan I.
PY - 2021/4/1
Y1 - 2021/4/1
AB - We have developed periscope, a tool for the detection and quantification of subgenomic RNA (sgRNA) in SARS-CoV-2 genomic sequence data. The translation of the SARS-CoV-2 RNA genome for most open reading frames (ORFs) occurs via RNA intermediates termed "subgenomic RNAs." sgRNAs are produced through discontinuous transcription, which relies on homology between transcription regulatory sequences (TRS-B) upstream of the ORF start codons and that of the TRS-L, which is located in the 5' UTR. TRS-L is immediately preceded by a leader sequence. This leader sequence is therefore found at the 5' end of all sgRNA. We applied periscope to 1155 SARS-CoV-2 genomes from Sheffield, United Kingdom, and validated our findings using orthogonal data sets and in vitro cell systems. By using a simple local alignment to detect reads that contain the leader sequence, we were able to identify and quantify reads arising from canonical and noncanonical sgRNA. We were able to detect all canonical sgRNAs at the expected abundances, with the exception of ORF10. A number of recurrent noncanonical sgRNAs are detected. We show that the results are reproducible using technical replicates and determine the optimum number of reads for sgRNA analysis. In VeroE6 ACE2+/- cell lines, periscope can detect the changes in the kinetics of sgRNA in orthogonal sequencing data sets. Finally, variants found in genomic RNA are transmitted to sgRNAs with high fidelity in most cases. This tool can be applied to all sequenced COVID-19 samples worldwide to provide comprehensive analysis of SARS-CoV-2 sgRNA.
UR - http://www.scopus.com/inward/record.url?scp=85103802904&partnerID=8YFLogxK
U2 - 10.1101/gr.268110.120
DO - 10.1101/gr.268110.120
M3 - Article
C2 - 33722935
AN - SCOPUS:85103802904
SN - 1088-9051
VL - 31
SP - 645
EP - 658
JO - Genome Research
JF - Genome Research
IS - 4
ER -