“Deep” Sequencing Accuracy and Reproducibility using Roche/454 technology for Inferring Co-Receptor Usage in HIV-1

Identifiers: SRA: ERP005461
BioProject: PRJEB6005
CFENET-UBC: ena-STUDY-CFENET-UBC-28-03-2014-19:44:20:612-1043
Study Type: Population Genomics
Abstract: Next-generation, “deep”, sequencing has a growing number of applications, both clinically and in disparate fields of research. This study investigates the accuracy and reproducibility of “deep” sequencing as applied to co-receptor prediction using the V3 loop of Human Immunodeficiency Virus-1 (HIV-1). Despite its increasing use in HIV co-receptor prediction, the accuracy and reproducibility of deep sequencing technology, and the factors that can affect them, have received only limited investigation. To address this gap, repeated deep sequencing results were generated on the Roche GS-FLX (454) from a number of sources, including a non-homogeneous clinical sample (N=47 replicates over 18 deep sequencing runs) and a large clinical cohort from the MOTIVATE and A400129 studies (N=1521). In repeated measurements of the non-homogeneous clinical sample, increasing input copy number both decreased the variance in the measured proportion of non-R5-using virus (p<<0.001 and p=0.02 for single replicates and triplicates, respectively) and increased measured viral diversity (p<0.001; multiple measures). Sequences with a mean abundance below 1% showed a 2-fold increase in median coefficient of variation (CV) in repeated measurements of the non-homogeneous clinical sample, and a 2.7-fold increase in CV in the MOTIVATE/A400129 dataset, compared to sequences with ≥1% abundance. An unexpected source of error was read position, with low-accuracy reads occurring more frequently towards the edges of sequencing regions (p<<0.001). Overall, the primary source of variability was sampling error caused by low input copy number and low minority species prevalence, though other sources of error, including sequence-intrinsic, temporal, and read-position-related errors, were also detected.
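The link between low input copy number, low variant abundance, and a higher coefficient of variation can be illustrated with a simple binomial sampling model. The sketch below is not the study's analysis: it assumes that sampling error is purely binomial, and the function names `binomial_cv` and `coefficient_of_variation` are hypothetical names chosen for illustration.

```python
import math
import statistics


def binomial_cv(proportion: float, input_copies: int) -> float:
    """Expected CV of a variant's measured proportion under binomial sampling.

    Assumption (not from the study): each of `input_copies` template
    molecules independently carries the variant with probability
    `proportion`, so the measured proportion has
    mean p and standard deviation sqrt(p * (1 - p) / n).
    """
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    if input_copies <= 0:
        raise ValueError("input_copies must be positive")
    return math.sqrt((1 - proportion) / (input_copies * proportion))


def coefficient_of_variation(measurements: list[float]) -> float:
    """Empirical CV (sample standard deviation / mean) of replicate values."""
    mean = statistics.mean(measurements)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(measurements) / mean


if __name__ == "__main__":
    # Compare a minority variant (1%) with a more abundant one (10%)
    # at several hypothetical input copy numbers.
    for copies in (100, 1000, 10000):
        low = binomial_cv(0.01, copies)
        high = binomial_cv(0.10, copies)
        print(f"copies={copies}: CV(1%)={low:.3f}, CV(10%)={high:.3f}")
```

Under this model the CV falls as input copy number rises and is larger for rarer variants, which matches the direction of the effects reported in the abstract. The model does not account for the sequence-intrinsic, temporal, or read-position-related errors also described there.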

Related SRA data

Experiments: 48 (48 samples)
Runs: 48 (49.8 Mbp; 26.5 Mb)