TITAN – TitanCNA output

Summary

There are 3 outputs that can be generated directly by the TITAN R package.

1) Position-level results

  • results for every SNP in the analysis
  • usually named with extension *titan.txt

2) TITAN model parameters

  • Converged parameters estimated by TITAN
  • usually named with extension *params.txt

3) Chromosome-level plots

  • plots of copy number, LOH, and clonal clusters and cellular prevalence
  • optionally, can plot subclone profiles for runs with 2 clonal clusters

The TITANRunner pipeline also generates the segment files (using createTITANsegmentfiles.pl)

4) Segment results

  • predicted segments
  • usually named with extension *segs.txt

Detailed output format

1) Position-level results (*titan.txt)

This file is the main output of TITAN. Each row contains the results for one data point (SNP) analyzed by TITAN.
The columns are:
1) Chr
2) Position
3) RefCount – number of reads matching the reference base
4) NRefCount – number of reads matching the non-reference base
5) Depth – total read depth at the position
6) AllelicRatio – RefCount/Depth
7) LogRatio – log2 ratio between normalized tumour and normal read depths
8) CopyNumber – predicted TITAN copy number; {0,…, maxCN}, where maxCN is the specified maximum copy number for the run.
9) TITANstate* – internal state number used by TITAN; see supplementary table 2 in manuscript
10) TITANcall** – interpretable TITAN state; string {HOMD,DLOH,HET,NLOH,ALOH,ASCNA,BCNA,UBCNA}, see supplementary table 2 in manuscript
11) ClonalCluster – predicted TITAN clonal cluster; lower cluster numbers represent clusters with higher cellular prevalence; {1,…,numClusters}, where numClusters is the specified number of clonal clusters for the run.
12) CellularPrevalence – proportion of tumour cells containing event; not to be mistaken as proportion of sample (including normal)

* – The internal state number will be different if symmetric=TRUE setting is used
** – HOMD=homozygous deletion; DLOH=deletion LOH; HET=diploid heterozygous; NLOH=copy-neutral LOH; ALOH=amplified LOH; ASCNA=allele-specific CN amplification; BCNA=balanced CN amplification; UBCNA=unbalanced CN amplification
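
As a minimal sketch of reading this file, the Python below parses position-level rows and recomputes AllelicRatio as RefCount/Depth. It assumes the file is tab-delimited with a header row that uses the column names listed above. The example rows, including the state numbers, are made up for illustration and are not real TITAN output; `read_positions` and `allelic_ratio` are hypothetical helper names.

```python
import csv
import io

# Made-up example rows (not real TITAN output).
# Assumption: tab-delimited text with a header row named as in the column list.
EXAMPLE = (
    "Chr\tPosition\tRefCount\tNRefCount\tDepth\tAllelicRatio\tLogRatio\t"
    "CopyNumber\tTITANstate\tTITANcall\tClonalCluster\tCellularPrevalence\n"
    "1\t1000\t30\t10\t40\t0.75\t-0.5\t1\t1\tDLOH\t1\t0.8\n"
    "1\t2000\t20\t20\t40\t0.5\t0.0\t2\t3\tHET\tNA\tNA\n"
)

def read_positions(handle):
    """Return one dict per SNP row, converting the count columns to int."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for key in ("Position", "RefCount", "NRefCount", "Depth"):
            row[key] = int(row[key])
        rows.append(row)
    return rows

def allelic_ratio(row):
    """AllelicRatio as defined above: RefCount / Depth."""
    return row["RefCount"] / row["Depth"]

positions = read_positions(io.StringIO(EXAMPLE))
ratios = [allelic_ratio(r) for r in positions]
```

For a real file, the `io.StringIO` object would be replaced by an open file handle, provided the layout assumption holds.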


2) Model parameters (*params.txt)

This file contains the estimated TITAN model parameters and model selection index. Each row contains information regarding different parameters:
1) Normal contamination estimate – proportion of normal content in the sample; tumour content is 1 minus this number
2) Average tumour ploidy estimate – average number of estimated copies in the genome; 2 represents diploid
3) Clonal cluster cellular prevalence – Z denotes the number of clonal clusters; the space-delimited values that follow are the cellular prevalence estimates for each cluster
4) Genotype binomial means for clonal cluster Z – set of 21 binomial estimated parameters for each specified cluster
5) Genotype Gaussian means for clonal cluster Z – set of 21 Gaussian estimated means for each specified cluster
6) Genotype Gaussian variance – set of 21 Gaussian estimated variances; variances are shared across all clusters
7) Number of iterations – number of EM iterations needed for convergence
8) Log likelihood – complete data log-likelihood for current cluster run
9) S_Dbw dens.bw – density component of S_Dbw index
10) S_Dbw scat – scatter component of S_Dbw index
11) S_Dbw validity index – used for model selection; choose run with optimal number of clusters based on lowest S_Dbw index
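
Two of the rules above are simple enough to sketch in Python: tumour content is 1 minus the normal contamination estimate, and the preferred run is the one with the lowest S_Dbw validity index. The function names and the S_Dbw values below are hypothetical; parsing the params file itself is not shown because its exact line layout is not described here.

```python
def tumour_content(normal_contamination):
    """Tumour content is 1 minus the normal contamination estimate."""
    return 1.0 - normal_contamination

def select_run(sdbw_by_clusters):
    """Return the number of clusters whose run has the lowest S_Dbw validity index."""
    return min(sdbw_by_clusters, key=sdbw_by_clusters.get)

# Hypothetical S_Dbw validity index values, keyed by number of clonal clusters.
example_sdbw = {1: 0.42, 2: 0.31, 3: 0.37}
best = select_run(example_sdbw)
```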


3) Chromosome-level plots

Three tracks are generated for each plot. Each data point in each track represents a germline heterozygous SNP locus in the TITAN analysis.

1) Copy number alterations (log ratio)
The Y-axis is based on log ratios. Log ratios are computed as the log2 ratio between normalized tumour and normal read depths. Data points close to 0 represent diploid, those above 0 are copy gains, and those below 0 are deletions.
Bright Green – HOMD
Green – DLOH
Blue – HET, NLOH
Dark Red – GAIN
Red – ASCNA, UBCNA, BCNA

2) Loss of heterozygosity (allelic ratio)
The Y-axis is based on allelic ratios. Allelic ratios are computed as RefCount/Depth. Data points close to 1 represent homozygous reference base, close to 0 represent homozygous non-reference base, and close to 0.5 represent heterozygous. Normal contamination influences the divergence away from 0.5 for LOH events.
Grey – HET, BCNA
Bright Green – HOMD
Green – DLOH, ALOH
Blue – NLOH
Dark Red – GAIN
Red – ASCNA, UBCNA

3) Cellular prevalence and clonal clusters
The Y-axis is the cellular prevalence that includes the normal proportion. Therefore, the cellular prevalence here refers to the proportion in the sample (including normal). Lines are drawn for each data point indicating the cellular prevalence. Heterozygous diploid positions are not shown because this is a normal genotype and is not categorized as subclonal (that is, 100% of cells are normal at these positions).
The black horizontal line represents the tumour content labeled as “T”. Each horizontal grey line represents the cellular prevalence of the clonal clusters labeled as Z1, Z2, etc.
Colours are the same as in the allelic ratio plot.
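
The relationship between the two prevalence scales can be sketched as follows. This assumes the plotted value equals the CellularPrevalence column (proportion of tumour cells) multiplied by the tumour content, that is, 1 minus the normal contamination estimate. That scaling is inferred from the description above and is not taken from the package's plotting code; `sample_prevalence` is a hypothetical helper name.

```python
def sample_prevalence(cellular_prevalence, normal_contamination):
    """Convert a proportion of tumour cells into a proportion of the whole sample.

    Assumption: sample proportion = tumour-cell proportion * tumour content.
    """
    tumour_content = 1.0 - normal_contamination
    return cellular_prevalence * tumour_content

# Example: an event in half of the tumour cells, with 20% normal contamination.
example = sample_prevalence(0.5, 0.2)
```

Under this assumption, an event present in all tumour cells (prevalence 1) sits exactly on the tumour content line "T".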


4) Segment results

This file contains results for the predicted segments.  Segments are defined by consecutive SNP positions with either the same CNA/LOH genotype state or clonal cluster.
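
The segmentation idea can be sketched in Python as a run-length grouping of consecutive SNPs. This is not the logic of createTITANsegmentfiles.pl, which is not shown here. It assumes rows are already sorted by chromosome and position, and that a new segment starts whenever the chromosome, the TITAN call, or the clonal cluster changes. The `segment` function, the `NumSNPs` key and the example rows are hypothetical.

```python
from itertools import groupby

def segment(rows):
    """Group consecutive rows sharing chromosome, TITAN call and clonal cluster."""
    def key(row):
        return (row["Chr"], row["TITANcall"], row["ClonalCluster"])

    segments = []
    for (chrom, call, cluster), run in groupby(rows, key=key):
        run = list(run)
        segments.append({
            "Chromosome": chrom,
            "Start": run[0]["Position"],   # first SNP position in the run
            "End": run[-1]["Position"],    # last SNP position in the run
            "NumSNPs": len(run),
            "TITAN_call": call,
            "ClonalCluster": cluster,
        })
    return segments

# Made-up rows for illustration, sorted by chromosome and position.
example_rows = [
    {"Chr": "1", "Position": 100, "TITANcall": "DLOH", "ClonalCluster": "1"},
    {"Chr": "1", "Position": 200, "TITANcall": "DLOH", "ClonalCluster": "1"},
    {"Chr": "1", "Position": 300, "TITANcall": "DLOH", "ClonalCluster": "2"},
    {"Chr": "2", "Position": 50, "TITANcall": "DLOH", "ClonalCluster": "2"},
]
example_segments = segment(example_rows)
```
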
There are 14 columns:
1) Sample id
2) Chromosome
3) Start_Position(bp)
4) End_Position(bp)
5) Length(bp)
6) Median_Ratio* – median symmetric allelic ratio across the SNPs in the segment
7) Median_logR – median log ratio across the SNPs in the segment
8) TITAN_state** – internal state number used by TITAN; see supplementary table 2 in manuscript
9) TITAN_call*** – interpretable TITAN state; string {HOMD,DLOH,HET,NLOH,ALOH,ASCNA,BCNA,UBCNA}, see supplementary table 2 in manuscript
10) Copy_Number – predicted TITAN copy number; {0,…, maxCN}, where maxCN is the specified maximum copy number for the run.
11) MinorCN**** – minor copy number
12) MajorCN**** – major copy number
13) ClonalCluster – predicted TITAN clonal cluster; lower cluster numbers represent clusters with higher cellular prevalence; {1,…,numClusters}, where numClusters is the specified number of clonal clusters for the run.
14) CellularPrevalence – proportion of tumour cells containing event; not to be mistaken as proportion of sample (including normal)

* – symmetric allelic ratio is computed as max(ref count, nonRef count)/depth
** – The internal state number will be different if symmetric=TRUE setting is used
*** – HOMD=homozygous deletion; DLOH=deletion LOH; HET=diploid heterozygous; NLOH=copy-neutral LOH; ALOH=amplified LOH; ASCNA=allele-specific CN amplification; BCNA=balanced CN amplification; UBCNA=unbalanced CN amplification
**** – minor and major copy numbers are determined by the copy number genotype
For example, if Copy_Number=6 and TITAN_call=ASCNA, the genotype is AAAAAB or ABBBBB, which has MinorCN=1 and MajorCN=5.
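
The two footnoted definitions can be checked with a short Python sketch: minor and major copy number are the smaller and larger allele counts in the genotype string, and the symmetric allelic ratio is max(ref count, nonRef count)/depth. The function names are hypothetical, and representing the genotype as a string of A and B characters follows the example above.

```python
def minor_major_cn(genotype):
    """Return (MinorCN, MajorCN) for a genotype string such as 'AAAAAB'."""
    a = genotype.count("A")
    b = genotype.count("B")
    return min(a, b), max(a, b)

def symmetric_allelic_ratio(ref_count, nref_count, depth):
    """Symmetric allelic ratio: max(ref count, nonRef count) / depth."""
    return max(ref_count, nref_count) / depth

# The ASCNA example from the text: both genotypes give MinorCN=1, MajorCN=5.
example_a = minor_major_cn("AAAAAB")
example_b = minor_major_cn("ABBBBB")
```

Because the ratio takes the larger of the two counts, it gives the same value whichever allele is amplified, which is why the two genotypes in the example are treated alike.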