The utility of low-depth Whole Genome Sequencing (WGS)



A human sample sequenced at 0.1x depth with 150 bp reads yields about 2.2 million reads, corresponding to roughly 330 Mbp of sequenced bases. At 0.4x depth, at least one sequencing read is expected to cover each of around 28 million of the 84.7 million genetic variants identified in the 1000 Genomes Project (Auton et al., 2015), substantially more sites than a traditional genotyping array measures.
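The figures above can be reproduced with a short back-of-the-envelope calculation. The sketch below is illustrative only and rests on two assumptions that the text does not state explicitly: a genome size of 3.3 Gbp (implied by 330 Mbp at 0.1x) and reads landing uniformly at random, so that per-site coverage follows a Poisson distribution with mean equal to the depth. Under that model, the 28 million figure corresponds to the probability that a site is covered by at least one read. The function names are hypothetical.

```python
import math

# Assumptions (not stated in the article): genome size implied by
# 330 Mbp at 0.1x, and Poisson-distributed per-site coverage.
GENOME_SIZE_BP = 3.3e9
READ_LENGTH_BP = 150
KNOWN_VARIANTS = 84.7e6  # 1000 Genomes Project (Auton et al., 2015)


def expected_reads(depth, genome_size=GENOME_SIZE_BP, read_length=READ_LENGTH_BP):
    """Number of reads needed to reach the given mean depth."""
    return depth * genome_size / read_length


def fraction_covered(depth):
    """Poisson probability that a site is covered by at least one read."""
    return 1.0 - math.exp(-depth)


def expected_variants_covered(depth, n_variants=KNOWN_VARIANTS):
    """Expected number of known variant sites with at least one read."""
    return n_variants * fraction_covered(depth)


if __name__ == "__main__":
    print(f"reads at 0.1x: {expected_reads(0.1) / 1e6:.1f} million")
    print(f"bases at 0.1x: {0.1 * GENOME_SIZE_BP / 1e6:.0f} Mbp")
    print(f"variants covered at 0.4x: {expected_variants_covered(0.4) / 1e6:.1f} million")
```

Real libraries deviate from this idealized model (for example through GC bias and unmappable regions), so the numbers should be read as rough expectations rather than guarantees.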


Wasik et al. (2021) compared low-depth sequencing and array genotyping for trait mapping in pharmacogenetics. WGS data sequenced to an average depth of 1x on the Illumina HiSeq 4000 platform with paired-end 150 bp (PE150) reads were down-sampled to 0.8x, 0.6x, and 0.4x. After imputation, the results were compared with those from the Affymetrix Axiom Biobank Precision Medicine Research Array (PMRA) on four metrics: overall concordance, concordance at single nucleotide polymorphisms in pharmacogenetics-related genes, concordance in imputed HLA genotypes, and imputation r². Both the array genotype data and the sequencing data were used to impute HLA genotypes for all samples. Four-digit HLA alleles in the major histocompatibility complex (MHC) region imputed from sequencing data were highly concordant with those imputed from the genotyping array. Validating the HLA genotype calls against a gold-standard assay gave similar concordance for the sequence-based and array-based imputations. For the purposes of trait mapping, low-depth sequencing at coverage above 0.4x had higher overall imputation accuracy, as measured by imputation r², than the genotyping array, indicating a corresponding increase in power.
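To make the two kinds of metric concrete, the following is a minimal sketch of how genotype concordance and imputation r2 are commonly computed. It is not the pipeline used by Wasik et al.; the exact definitions in that study are not given here, so the functions below (hypothetical names) assume the usual conventions: concordance is the fraction of compared sites whose called genotype matches the reference genotype, and imputation r2 is the squared Pearson correlation between imputed allele dosages (0 to 2) and true genotypes, typically evaluated per variant across samples.

```python
def concordance(called, truth):
    """Fraction of sites where the called genotype equals the true genotype.

    Genotypes are alternate-allele counts (0, 1, 2); sites where either
    value is None (missing) are skipped.
    """
    pairs = [(c, t) for c, t in zip(called, truth)
             if c is not None and t is not None]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(c == t for c, t in pairs) / len(pairs)


def imputation_r2(dosages, truth):
    """Squared Pearson correlation between imputed dosages and true genotypes."""
    n = len(dosages)
    if n != len(truth) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_d = sum(dosages) / n
    mean_t = sum(truth) / n
    cov = sum((d - mean_d) * (t - mean_t) for d, t in zip(dosages, truth))
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    if var_d == 0 or var_t == 0:
        return float("nan")  # undefined when either side has no variation
    return cov * cov / (var_d * var_t)


if __name__ == "__main__":
    # Toy data for one variant across five samples (made up for illustration).
    truth = [0, 1, 2, 0, 1]
    called = [0, 1, 2, 1, 1]
    dosages = [0.1, 0.9, 1.8, 0.6, 1.2]
    print("concordance:", concordance(called, truth))
    print("imputation r2:", imputation_r2(dosages, truth))
```

Because r2 works on continuous dosages rather than hard calls, it rewards imputation that is uncertain in the right direction, which is one reason it is often used as a proxy for association power.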


Gilly et al. (2016) demonstrated that even 1x and 4x depth WGS, on the Illumina HiSeq 2000 and Illumina HiSeq 2500, enables the detection of rare APOC3 variant signals that can be missed by hybrid genotyping and imputation approaches, even when the imputation panel includes population-specific haplotypes. In that study, Illumina OmniExpress and ExomeChip platform data were merged and imputed up to an in-house reference panel containing the phased haplotypes of 1092 WGS samples. A WGS approach also presents opportunities to enhance variant discovery (Martin et al., 2021). With a substantial sample size, this approach allows the discovery of associations between ultra-rare variants with large effects and quantitative traits (Visscher et al., 2017).




Citations:


Auton, A., Abecasis, G. R., Altshuler, D. M., Durbin, R. M., Bentley, D. R., & Chakravarti, A. (2015). A global reference for human genetic variation. Nature, 526(7571), 68-74.


Bai, Y., Ni, M., Cooper, B., Wei, Y., & Fury, W. (2014). Inference of high resolution HLA types using genome-wide RNA or DNA sequencing reads. BMC Genomics, 15(1), 1-16.


Gilly, A., Southam, L., Suveges, D., Kuchenbaecker, K., Moore, R., Melloni, G. E., ... & Zeggini, E. (2019). Very low-depth whole-genome sequencing in complex trait association studies. Bioinformatics, 35(15), 2555-2561.


Martin, A. R., Atkinson, E. G., Chapman, S. B., Stevenson, A., Stroud, R. E., Abebe, T., ... & NeuroGAP-Psychosis Study Team. (2021). Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations. The American Journal of Human Genetics, 108(4), 656-668.


Wasik, K., Berisa, T., Pickrell, J. K., Li, J. H., Fraser, D. J., King, K., & Cox, C. (2021). Comparing low-pass sequencing and genotyping for trait mapping in pharmacogenetics. BMC Genomics, 22(1), 1-7.


Visscher, P. M., Wray, N. R., Zhang, Q., Sklar, P., McCarthy, M. I., Brown, M. A., & Yang, J. (2017). 10 years of GWAS discovery: biology, function, and translation. The American Journal of Human Genetics, 101(1), 5-22.

