Whole Genome Sequencing (WGS) is the most comprehensive method for analyzing an individual’s complete DNA sequence. It aims to read nearly the entire genome: coding regions (exons), non-coding regions (introns and regulatory elements such as promoters and enhancers), and repetitive regions, giving a near-complete view of genetic variation.
WGS captures single nucleotide variants (SNVs), insertions/deletions (indels), copy number variations (CNVs), structural variants (SVs), and mitochondrial DNA variants, which makes it a powerful tool for both clinical diagnostics and research discovery.
Purpose: Decode the entire genome to identify all types of genetic variation
Target Area: Entire nuclear and mitochondrial genome (~3.2 billion base pairs)
Applications: Rare disease diagnosis, cancer genomics, population genetics, infectious disease tracking, pharmacogenomics
Output: High-resolution map of all known and novel genomic variants
Sample Collection & DNA Extraction
Genomic DNA is extracted from biological samples like blood, saliva, or tissue.
Library Preparation
DNA is fragmented and sequencing adapters are ligated to both ends.
Sequencing
Libraries are sequenced using high-throughput platforms such as Illumina, MGI, Oxford Nanopore (ONT), or PacBio.
Data Processing & Alignment
Raw reads are aligned to a reference genome (e.g., GRCh38) using tools like BWA or Minimap2.
Variant Calling & Annotation
Software pipelines detect SNVs, indels, CNVs, and structural variants, which are then annotated using databases such as ClinVar, gnomAD, and OMIM.
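To make the variant categories above more concrete, the sketch below sorts the REF/ALT alleles of a VCF-style record into SNV, indel, or SV by size. It is only an illustration, not how any particular variant caller works: the function names `classify_variant` and `parse_vcf_line` and the 50 bp SV cutoff are assumptions chosen for this example, and real pipelines rely on dedicated callers and richer VCF annotations.

```python
def classify_variant(ref: str, alt: str, sv_min_len: int = 50) -> str:
    """Classify one REF/ALT allele pair by size.

    The 50 bp boundary between indels and SVs is a common convention,
    used here as an assumption rather than a fixed rule.
    """
    # Symbolic alleles (<DEL>, <DUP>, ...) and breakend notation ([ or ])
    # are ways a VCF file can encode structural variants.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "SV"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    size = abs(len(ref) - len(alt))
    if size >= sv_min_len:
        return "SV"
    if size > 0:
        return "indel"
    return "MNV"  # same-length substitution spanning several bases


def parse_vcf_line(line: str) -> dict:
    """Parse the first five tab-separated VCF columns of a data line."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    # ALT may list several alleles separated by commas.
    alts = alt.split(",")
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alts": alts,
        "types": [classify_variant(ref, a) for a in alts],
    }


if __name__ == "__main__":
    # Made-up records for illustration only.
    example_lines = [
        "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
        "chr2\t2000\t.\tAT\tA\t50\tPASS\t.",
        "chr3\t3000\t.\tC\t<DEL>\t50\tPASS\t.",
    ]
    for line in example_lines:
        record = parse_vcf_line(line)
        print(record["chrom"], record["pos"], record["types"])
```

Classifying by allele length alone is a simplification; CNVs, for instance, are usually inferred from read depth rather than from REF/ALT strings.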
Rare Disease Diagnosis
Can detect causative variants, including those outside coding regions, when targeted methods fail to find one.
Cancer Genomics
Identifies somatic mutations, structural rearrangements, and tumor mutational burden.
Prenatal & Neonatal Screening
Supports early, and in some cases non-invasive, detection of genetic abnormalities.
Infectious Disease Genomics
Tracks viral evolution (e.g., SARS-CoV-2), antimicrobial resistance, or pathogen outbreaks.
Population & Evolutionary Genomics
Studies human variation, ancestry, and natural selection patterns.
Pharmacogenomics
Evaluates how genes affect drug response and metabolism.
Gene Discovery & Functional Genomics
Links phenotype to genotype across regulatory and structural regions.
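Of the applications listed above, tumor mutational burden (TMB) from the cancer genomics entry is a simple calculation. It is usually reported as somatic mutations per megabase of sequence analyzed. The sketch below assumes that simple definition. The function name, the mutation count, and the callable-genome size are hypothetical values for illustration, and laboratories differ in which mutation types they count and which regions they include in the denominator.

```python
def tumor_mutational_burden(n_somatic_mutations: int, callable_bases: int) -> float:
    """Return somatic mutations per megabase of callable sequence.

    Both inputs are assumed to come from an upstream somatic calling step;
    which mutations qualify is a lab-specific choice not modeled here.
    """
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    megabases = callable_bases / 1_000_000
    return n_somatic_mutations / megabases


if __name__ == "__main__":
    # Hypothetical tumor: 28,000 somatic mutations over 2.8 Gb of callable genome.
    tmb = tumor_mutational_burden(28_000, 2_800_000_000)
    print(f"TMB = {tmb:.1f} mutations/Mb")
```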
Complete Coverage
Captures coding, non-coding, regulatory, and repetitive DNA.
High Resolution
Detects rare, common, and novel variants at single-base resolution.
Unbiased
Does not rely on gene panels or capture probes; ideal for discovering unknown variants.
Structural Variation Detection
Identifies large deletions, duplications, inversions, and translocations.
Mitochondrial Sequencing
Captures both nuclear and mitochondrial genomes in one workflow.
Future-Proof
Once sequenced, the data can be reanalyzed as new knowledge or tools emerge.
Feature | Description |
---|---|
Coverage | Typically 30× for germline, 100×+ for somatic/cancer WGS |
Genome Size | ~3.2 billion base pairs (human genome) |
Read Types | Short-read (Illumina), Long-read (PacBio, ONT) |
Variant Types Detected | SNVs, indels, CNVs, SVs, tandem repeats, mitochondrial variants |
Turnaround Time | 2–8 weeks depending on pipeline and platform |
Data Size | 90–150 GB per genome (FASTQ), compressed BAM/VCF available |
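
The coverage figures in the table can be related to read counts with the standard approximation: mean depth = (number of reads × read length) / genome size. The sketch below applies it using the ~3.2 billion bp genome size from the table. The 150 bp read length is an assumption typical of short-read runs, and the function names are made up for this example. Real yields are somewhat lower because of duplicates, unmapped reads, and uneven coverage.

```python
import math

HUMAN_GENOME_BP = 3_200_000_000  # approximate size quoted in the table


def mean_depth(n_reads: int, read_length: int, genome_size: int = HUMAN_GENOME_BP) -> float:
    """Average depth, assuming every sequenced base maps to the genome."""
    return n_reads * read_length / genome_size


def reads_for_depth(target_depth: float, read_length: int,
                    genome_size: int = HUMAN_GENOME_BP) -> int:
    """Minimum number of reads needed to reach a target mean depth."""
    return math.ceil(target_depth * genome_size / read_length)


if __name__ == "__main__":
    # Assumed read length of 150 bp; depths taken from the table.
    for depth in (30, 100):
        reads = reads_for_depth(depth, 150)
        print(f"{depth}x needs about {reads:,} reads ({reads // 2:,} read pairs)")
```

Under these assumptions, 30× germline coverage works out to roughly 640 million 150 bp reads, or about 320 million read pairs.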
Cost
Although costs are falling, WGS remains more expensive than targeted methods.
Data Analysis Complexity
Requires high computational power and expert bioinformatics.
Variant Interpretation
Many variants are of uncertain significance, especially in non-coding regions.
Incidental Findings
May reveal unexpected or unrelated genetic information (e.g., predisposition to disease).
Ethical and Privacy Concerns
WGS data is sensitive and must be stored and handled with strict confidentiality.