TITAN – Frequently Asked Questions



1. The S_Dbw validity index is not selecting the correct run. Is there something I can tweak?
2. How can I tell if the run with the optimal number of clonal clusters is selected?
3. Why are my copy number state predictions reversed?


1. The S_Dbw validity index is not selecting the correct run. Is there something I can tweak?

Yes. Users can adjust how the S_Dbw validity index is computed so that a different TITAN solution is selected as having the optimal number of clonal clusters. Adjustment may be needed because some cancer types have more copy number events than others, and the index is computed from how well defined the events are across clonal clusters and copy number states.

The S_Dbw validity index is written to the output file produced by the function outputModelParameters(). Here is an example of the last 3 lines of this output file:

S_Dbw dens.bw: 0.0508
S_Dbw scat: 0.3859
S_Dbw validity index: 1.6567

The formula for this is as follows:

S_Dbw validity index = weight * dens.bw + scat

where weight = 25 by default (but see the note on v1.3.1 below)

The weight on dens.bw penalizes runs with a higher number of clonal clusters: a larger weight favors solutions whose clusters are more distinctly separated. This weighting may not always be appropriate for your dataset, so users may need to recompute the equation above with a different weight.

In general, the higher the weight, the fewer clonal clusters the optimal run will have; the lower the weight, the more clonal clusters it will have.
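
As a rough illustration, the formula above can be recomputed by hand in R. The sketch below uses the dens.bw and scat values from the example output; the helper name recompute_sdbw is hypothetical and is not part of TitanCNA.

```r
# Hypothetical helper (not part of TitanCNA): recompute the index
# from the two components reported in the parameter output file.
recompute_sdbw <- function(dens_bw, scat, weight = 25) {
  weight * dens_bw + scat
}

# Components taken from the example output above.
dens_bw <- 0.0508
scat <- 0.3859

# Recompute the index under a few alternative weights.
weights <- c(1, 10, 25, 50)
data.frame(weight = weights,
           index = recompute_sdbw(dens_bw, scat, weights))
```

With weight = 25 this gives 1.6559, close to the reported 1.6567; the small difference is presumably due to rounding of the printed components. For a single run the index always rises with the weight, so the choice of optimal run only changes when the same weight is applied to every run being compared: runs with a larger dens.bw are penalized more as the weight grows.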

Note that as of v1.3.1, the default method for computing S_Dbw is the Tong et al. method (see Installation for update details). For this version and higher, the default is weight = 1.

Users do not need to re-run TitanCNA to recompute the S_Dbw validity index with a different weight; applying the formula above is the easiest approach. Alternatively, users can specify a different weight directly in the outputModelParameters function using the S_Dbw.scale argument.

outputModelParameters(convergeParams, results, outparam, S_Dbw.scale = 1, S_Dbw.method = "Tong")
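
When several runs have already been written to disk, the recomputation can also be scripted. The following is a minimal sketch, assuming each parameter file contains lines of the form "S_Dbw dens.bw: 0.0508" and "S_Dbw scat: 0.3859" as in the example above. The helper names read_sdbw and score_runs are hypothetical, and the label text may differ between TitanCNA versions, so adjust the patterns to match your files.

```r
# Hypothetical helper: extract the S_Dbw components from one parameter file.
# Assumes lines such as "S_Dbw dens.bw: 0.0508" and "S_Dbw scat: 0.3859".
read_sdbw <- function(path) {
  lines <- readLines(path)
  get_value <- function(label) {
    hit <- grep(label, lines, fixed = TRUE, value = TRUE)[1]
    # Keep only the number that follows the last colon on the line.
    as.numeric(sub("^.*:[[:space:]]*", "", hit))
  }
  c(dens_bw = get_value("S_Dbw dens.bw"), scat = get_value("S_Dbw scat"))
}

# Recompute the index for each file with a user-chosen weight.
score_runs <- function(param_files, weight = 25) {
  sapply(param_files, function(f) {
    comp <- read_sdbw(f)
    unname(weight * comp["dens_bw"] + comp["scat"])
  })
}

# Example with a temporary file that mimics the lines shown earlier.
tmp <- tempfile(fileext = ".txt")
writeLines(c("S_Dbw dens.bw: 0.0508",
             "S_Dbw scat: 0.3859",
             "S_Dbw validity index: 1.6567"), tmp)
score_runs(tmp, weight = 1)
```

Passing a vector of file paths returns one recomputed index per run. Assuming the run with the smallest index is the one to prefer, which is the usual convention for S_Dbw, which.min() on the result picks the candidate run for a given weight.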

2. How can I tell if the run with the optimal number of clonal clusters is selected?

Although TitanCNA provides an approach for determining the run with the optimal number of clonal clusters, some manual inspection is required to confirm the choice. Users may have to adjust the S_Dbw validity index computation depending on their datasets (see FAQ #1).

Here is one way to tell whether the run selected by the S_Dbw index is likely incorrect.

a) Problem: Over-fitting to a higher number of clonal clusters.

b) Observation: A small number of data points fall in clonal cluster 1 (Z1; clonally dominant prevalence) across MANY of the chromosomes.

For example,

[Figure: SA299_cluster02_chr16 (left) and SA299_cluster01_chr16 (right)]

Left: Over-fitting to 2 clonal clusters

Right: Not over-fitting for 1 clonal cluster

Note that dramatic differences in normal contamination (and sometimes ploidy) can be an indication of improper optimal run selection. Manual inspection is encouraged when this happens.

c) Solution: Use the run with 1 fewer clonal cluster. Repeat this inspection and continue choosing the run with fewer clusters until over-fitting is not observed.
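
The inspection described above can be partly supported by a quick tabulation before looking at the plots. The sketch below assumes a per-position results table with columns named Chr and ClonalCluster; these column names and the toy data are assumptions for illustration, so adapt them to your own results. The 5% cutoff is arbitrary and is not a TITAN recommendation.

```r
# Toy stand-in for a per-position results table (hypothetical data).
results <- data.frame(
  Chr = rep(c("1", "2", "16"), each = 100),
  ClonalCluster = c(rep(2, 97), rep(1, 3),
                    rep(2, 98), rep(1, 2),
                    rep(2, 60), rep(1, 40))
)

# Fraction of positions assigned to clonal cluster 1 on each chromosome.
# Positions without a cluster assignment (NA) are ignored.
frac_z1 <- tapply(results$ClonalCluster == 1, results$Chr, mean, na.rm = TRUE)

# Chromosomes where cluster 1 holds only a small share of the data points.
# The 0.05 cutoff is an arbitrary illustration.
sparse_chr <- names(frac_z1)[which(frac_z1 > 0 & frac_z1 < 0.05)]
sparse_chr
```

If cluster 1 is sparse on many chromosomes, that is consistent with the over-fitting pattern described in (b). The per-chromosome plots remain the primary check.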

3. Why are my copy number state predictions reversed?

Sometimes the copy number states are reversed, a problem known as label switching. There are 2 possible reasons, each with its own solution.

a. The prior on the (Gaussian) variance for copy number (log ratio) is set with too high a precision.
Relax it by running the following lines of code before calling runEMclonalCN.

K <- length(params$genotypeParams$rt)  # get number of states
params$genotypeParams$alphaKHyper <- rep(12000, K)  # use values smaller than 12000 to relax further

b. The maximum number of copies for the TITAN analysis is not high enough.
If the analysis uses a maximum of 5 copies but the sample contains events with more than 5 copies, TITAN may call these high-level amplifications incorrectly.
Try using up to 8 maximum copies in your analysis. Currently, 8 is the highest number of copies that TITAN can analyze.

params <- loadDefaultParameters(copyNumber = 8, numberClonalClusters = numClusters)


4. My sample has low tumour content (cellularity). What should I expect from my results?

Coming soon…


5. How should I run TitanCNA for polyploid genomes? What if I do not know the ploidy in advance?

Coming soon…


6. Why are there so many large regions of homozygous deletions in my results?

Coming soon…