Gavin Ha

Research interests:
structural variation and copy number analysis, cancer genomics, machine learning, next-generation sequencing and microarray analysis, bioinformatics

Breast cancers are classified into distinct subtypes using clinically established biomarkers. Beyond these established clinical subtypes, tumours from different patients with the same subtype can have diverse genetic properties, leading to further sub-classification. This heterogeneity across individuals can, in turn, affect therapeutic resistance, clinical outcome, and metastasis.
The heterogeneity of breast cancer may be explained by distinct somatic genomic aberrations that contribute to the phenotype. Genomic aberrations include single nucleotide variants (SNVs), indels, and rearrangement events that alter genomic structure. Rearrangement events can include deletions, insertions, inversions, translocations, fusions, duplications, variable number tandem repeats, and target site duplications. Loosely, these events are classified as structural variants, many of which can also alter dosage, resulting in copy number alterations (CNAs). These events are of high interest because many potential oncogenes and tumour suppressor genes may lie within or near the affected genomic regions.
High-throughput cancer genome data are obtained via high-density genotyping arrays and massively parallel re-sequencing technology (also termed next-generation sequencing). The latter is capable of producing millions of short reads at single-nucleotide resolution. Such resolution enables more accurate detection and analysis of genomic instability when surveying and profiling structural variations.
My research focus is to survey and detect somatic structural variation and CNA events in cancer genomes using statistical approaches and machine learning algorithms. This can help us answer the following questions:

  1. What are the recurrent somatic mutations found in a particular breast tumour subtype?
  2. What are the finer sub-classifications of this subtype?
  3. What are the molecular features used to distinguish between the finer sub-classes?

Projects:
I am involved in the METABRIC project, in which I contribute bioinformatics analyses of high-density genotyping array data from over 2000 breast tumour samples. Specifically, I am using probabilistic models to infer copy number alterations (CNAs) in breast tumour samples and subsequently clustering these molecular features to discover novel subtypes.

Software:
I am developing an approach to distinguish germline and somatic copy number events in SNP genotyping data. You can download the Matlab implementation and beta release from the software page.

Education & Past Experience:
I am currently a graduate student in the CIHR Bioinformatics Training Program at the University of British Columbia. I am stationed at the Centre for Translational and Applied Genomics (CTAG) and BC Cancer Research Centre under the supervision of Dr. Samuel Aparicio and Dr. Sohrab Shah. I completed my undergraduate degree in May 2008 at UBC with a combined major in Computer Science and Microbiology/Immunology. In September 2008, I started my graduate training in bioinformatics, funded by the CIHR.

Contact:
Name: Gavin Ha
E-mail: gha [at] bccrc [dot] ca
Telephone: 1-604-877-6000 x2140
Centre for Translational and Applied Genomics
BC Cancer Agency
Vancouver, BC, Canada