geneprint

Whole Exome Sequencing

What is Whole Exome Sequencing (WES)?

Whole Exome Sequencing (WES) is a next-generation sequencing (NGS) technique that targets and sequences the protein-coding regions (exons) of the genome, collectively known as the exome. Although exons make up only 1–2% of the human genome, they are estimated to contain about 85% of known disease-related mutations.

WES offers a cost-effective and powerful method for identifying genetic variants associated with inherited diseases, cancer, and complex traits.

Overview

  • Purpose: Identify mutations in coding regions of genes

  • Target Regions: All known protein-coding exons (~20,000 genes)

  • Output: Variants (SNPs, indels) in exons and splice sites

  • Applications: Rare disease diagnosis, cancer genomics, pharmacogenomics, personalized medicine
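
To make the "exons and splice sites" output concrete, here is a minimal Python sketch that labels a genomic position relative to a set of exon intervals. The intervals, the `classify_position` name, and the 2-base splice window are illustrative assumptions, not part of any specific capture kit or annotation tool.

```python
from typing import List, Tuple

# Hypothetical exon intervals on one chromosome, BED-style:
# 0-based start, end-exclusive.
EXONS: List[Tuple[int, int]] = [(100, 200), (500, 650)]

# Assumption: treat the 2 intronic bases flanking each exon as splice sites.
SPLICE_WINDOW = 2


def classify_position(pos: int, exons: List[Tuple[int, int]] = EXONS) -> str:
    """Label a 0-based position as 'exonic', 'splice_site', or 'off_target'."""
    # Exonic hits take priority over splice-site hits.
    for start, end in exons:
        if start <= pos < end:
            return "exonic"
    # Otherwise check the short intronic flanks on either side of each exon.
    for start, end in exons:
        upstream = start - SPLICE_WINDOW <= pos < start
        downstream = end <= pos < end + SPLICE_WINDOW
        if upstream or downstream:
            return "splice_site"
    return "off_target"


if __name__ == "__main__":
    for p in (150, 99, 201, 300):
        print(p, classify_position(p))
```

This sketch treats both flanks of every exon as splice sites, which does not hold for the outer edges of a transcript's first and last exons; a real annotator such as VEP or SnpEff works from full transcript models instead.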

How Whole Exome Sequencing Works

  1. DNA Extraction
    Genomic DNA is isolated from blood, saliva, or tissue samples.

  2. Library Preparation
    DNA is fragmented and adapters are added for sequencing.

  3. Exome Capture (Enrichment)
    Biotinylated probes hybridize specifically to exon regions. The probe-bound fragments are then pulled down using streptavidin-coated magnetic beads.

  4. Amplification & Sequencing
    Captured fragments are PCR-amplified and sequenced using high-throughput platforms like Illumina or MGI.

  5. Data Analysis
    Raw reads are mapped to the reference genome, and variant calling identifies single nucleotide variants (SNVs), insertions, deletions, and splice site mutations.
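
The variant classes named in step 5 can be told apart from the reference and alternate alleles a variant caller reports. The Python sketch below shows one way to do that, assuming VCF-style REF/ALT strings; the `variant_type` helper and the example calls are hypothetical and are not the output of any particular caller.

```python
from collections import Counter


def variant_type(ref: str, alt: str) -> str:
    """Classify a variant from VCF-style REF and ALT allele strings."""
    if ref == alt:
        raise ValueError("REF and ALT are identical; not a variant")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"        # single base substituted
    if len(ref) < len(alt):
        return "insertion"  # ALT carries extra bases
    if len(ref) > len(alt):
        return "deletion"   # ALT is missing bases
    return "MNV"            # same length > 1: multi-nucleotide substitution


# Hypothetical calls: (chromosome, 1-based position, REF, ALT)
calls = [
    ("chr1", 12345, "A", "G"),
    ("chr2", 67890, "A", "ATG"),
    ("chr7", 13579, "CTT", "C"),
]

counts = Counter(variant_type(ref, alt) for _, _, ref, alt in calls)

if __name__ == "__main__":
    print(dict(counts))
```

Splice site mutations are not a separate allele type in this scheme: they are SNVs or indels that are labeled by where they fall, which is an annotation step rather than a calling step.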

Applications of Whole Exome Sequencing

  • Rare Disease Diagnosis
    Detects inherited mutations responsible for Mendelian disorders and undiagnosed conditions.

  • Cancer Genomics
    Identifies somatic mutations in tumor DNA that drive cancer development.

  • Prenatal & Pediatric Testing
    Screens for genetic defects in fetuses or children with developmental disorders.

  • Neurogenetic Disorders
    Investigates causes of epilepsy, autism, intellectual disability, and neurodegeneration.

  • Pharmacogenomics
    Determines genetic variants that affect drug metabolism and response.

  • Carrier Screening
    Identifies individuals who carry mutations that could be passed to offspring.

Advantages of Whole Exome Sequencing

  • High Clinical Yield
    Focuses on regions most likely to harbor pathogenic variants.

  • Cost-Effective
    Costs less than Whole Genome Sequencing while still providing sufficient clinical utility for many indications.

  • Efficient Analysis
    Smaller data size makes analysis and interpretation faster and easier.

  • Supports Novel Variant Discovery
    Unlike targeted panels, WES is not limited to a predefined set of genes or known mutations.

  • Customizable Coverage
    Capture kits can be tailored to specific gene sets or revised to reflect updated gene annotations.

Key Features of WES

Feature                   | Description
Target Size               | ~30–50 Mb (1–2% of the genome)
Variant Types Detected    | SNPs, insertions, deletions, splice site mutations
Read Depth                | Typically 100× or higher for accurate variant calling
Turnaround Time           | 2–6 weeks depending on pipeline
Databases for Annotation  | ClinVar, OMIM, HGMD, dbSNP, gnomAD
Bioinformatics Pipelines  | GATK, VarDict, BWA, Annovar, VEP, DeepVariant
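
The target size and read depth figures above are linked by simple arithmetic. The Python sketch below estimates what share of the genome a capture target covers and the mean on-target depth for a given run. The genome size of about 3.1 Gb, the 70% on-target rate, and the run parameters are assumptions chosen for illustration; real values depend on the capture kit and sequencer, and the estimate ignores duplicate reads.

```python
GENOME_SIZE_BP = 3_100_000_000  # assumed approximate human genome size


def exome_fraction(target_bp: int) -> float:
    """Fraction of the genome covered by the capture target."""
    return target_bp / GENOME_SIZE_BP


def mean_target_depth(read_pairs: int, read_length: int,
                      target_bp: int, on_target_rate: float = 0.7) -> float:
    """Estimate mean depth over the target for paired-end reads.

    depth = (pairs * 2 * read length * on-target rate) / target size
    """
    sequenced_bases = read_pairs * 2 * read_length
    return sequenced_bases * on_target_rate / target_bp


if __name__ == "__main__":
    target = 40_000_000  # hypothetical 40 Mb capture design
    print(f"exome fraction: {exome_fraction(target):.2%}")
    depth = mean_target_depth(20_000_000, 150, target)
    print(f"estimated mean depth: {depth:.0f}x")
```

Under these assumptions, 20 million 2×150 bp read pairs on a 40 Mb target give a mean depth of roughly 105×, in line with the "100× or higher" figure in the table.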

Limitations and Challenges

  • Does Not Cover Non-Coding Regions
    Misses regulatory variants in promoters, enhancers, introns, and UTRs.

  • Incomplete Coverage
    Some exons may have low or no coverage due to poor capture efficiency.

  • Misses Structural Variants
    Large rearrangements, copy number variants (CNVs), and repeat expansions often go undetected.

  • Interpretation Complexity
    Variant interpretation requires clinical correlation and expert curation.

  • False Positives/Negatives
    Errors in mapping or calling can occur, especially in GC-rich or repetitive regions.
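
Two of these limitations lend themselves to simple checks. The Python sketch below flags exons whose mean depth falls under a threshold and computes GC content for a sequence, since GC-rich regions are a common source of poor capture. The exon names, depth values, and the 20× cutoff are hypothetical; appropriate thresholds vary by laboratory and application.

```python
from typing import Dict, List


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_low_coverage(exon_depths: Dict[str, float],
                      min_depth: float = 20.0) -> List[str]:
    """List exons whose mean depth is below min_depth (assumed cutoff)."""
    return sorted(name for name, depth in exon_depths.items()
                  if depth < min_depth)


if __name__ == "__main__":
    # Hypothetical per-exon mean depths, e.g. summarized from a coverage report.
    depths = {"GENE1_exon1": 5.0, "GENE1_exon2": 120.0, "GENE2_exon1": 0.0}
    print(flag_low_coverage(depths))
    print(f"{gc_content('GGCGCCATGC'):.0%}")
```

Exons flagged this way are candidates for follow-up with an orthogonal method before a negative result is reported for them.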

Popular Tools and Pipelines for WES Analysis

  • QC & Trimming: FastQC, Trimmomatic

  • Alignment: BWA-MEM, Bowtie2

  • Variant Calling: GATK HaplotypeCaller, FreeBayes, DeepVariant

  • Annotation: Annovar, VEP, SnpEff

  • Visualization: IGV, UCSC Genome Browser

  • Interpretation: ClinVar, OMIM, HGMD, InterVar
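
As a rough illustration of how some of these tools chain together, the Python sketch below builds the command lines for a basic QC, alignment, and variant calling run without executing anything. The sample name, file names, and the `build_wes_commands` helper are hypothetical, and the sketch leaves out steps a production pipeline would normally include, such as adapter trimming, duplicate marking, and annotation.

```python
import shlex
from typing import List


def build_wes_commands(sample: str, ref: str, r1: str, r2: str,
                       targets_bed: str, threads: int = 4) -> List[List[str]]:
    """Return argv lists for a minimal WES run; nothing is executed here."""
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    # Read group header; bwa expands the literal "\t" sequences itself.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # Read-level quality control
        ["fastqc", r1, r2],
        # Alignment; bwa mem writes SAM to stdout
        ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, r1, r2],
        # Coordinate sort, reading the piped alignment from stdin ("-")
        ["samtools", "sort", "-o", bam, "-"],
        ["samtools", "index", bam],
        # Variant calling restricted to the capture targets
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
         "-L", targets_bed, "-O", vcf],
    ]


if __name__ == "__main__":
    cmds = build_wes_commands("S1", "ref.fa", "S1_R1.fq.gz",
                              "S1_R2.fq.gz", "targets.bed")
    for argv in cmds:
        print(shlex.join(argv))
```

The second and third commands are meant to be connected by a pipe when run. Returning argument lists rather than running the tools keeps the sketch easy to inspect and to adapt to a workflow manager.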