Sorry for the slow response, I've been out of town.
Short answer: That input is optional and may not be relevant to all cancers, so you may be fine without it.
Long answer: I believe we take the allele frequencies of germline SNPs in the tumor genome and look for large regions with no heterozygous (het) sites, which is what copy-number-neutral LOH (aka UPD) produces. You can segment them into discrete regions using CBS, as implemented in the DNAcopy package for R. Plotting the frequencies on a per-chromosome basis should also make these regions clearly evident, if they exist.
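To make the idea concrete, here is a minimal Python sketch of the "look for stretches with no het sites" step. It is not CBS and not what DNAcopy does; it is a plain run-length scan, and the function name, the 0.3 to 0.7 "het-looking" window, and the `min_sites` cutoff are all assumptions chosen for illustration. It expects SNP positions sorted along one chromosome and the matching tumor VAFs as fractions.

```python
def find_loh_runs(positions, vafs, het_low=0.3, het_high=0.7, min_sites=5):
    """Return (start_pos, end_pos, n_sites) for each run of consecutive
    SNPs in which no site has a het-looking VAF.

    The thresholds are illustrative assumptions, not recommended values.
    """
    runs = []
    run_start = None  # index of the first SNP in the current non-het run
    for i, vaf in enumerate(vafs):
        is_het = het_low <= vaf <= het_high
        if not is_het:
            if run_start is None:
                run_start = i
        else:
            # A het site ends the current run; keep it if it is long enough.
            if run_start is not None and i - run_start >= min_sites:
                runs.append((positions[run_start], positions[i - 1], i - run_start))
            run_start = None
    # Close a run that extends to the last SNP.
    if run_start is not None and len(vafs) - run_start >= min_sites:
        runs.append((positions[run_start], positions[-1], len(vafs) - run_start))
    return runs
```

Run it once per chromosome. In a real tumor, purity and noise will pull VAFs away from the clean 0, 0.5, and 1 values, so a proper segmentation method is the safer choice for anything beyond a quick look.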
- Chris Miller
We built the sciClone package for exactly this purpose: https://github.com/genome...
It takes inputs of somatic mutations, with readcounts and VAFs, and uses that information to infer subclonal populations in heterogeneous tumors. It also gives you some nice visualization options.
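For a rough feel of what "VAFs in, clusters out" means, here is a small Python sketch. It is not sciClone and does not use its model; it computes VAFs from read counts and groups them with a naive one-dimensional k-means, which I am swapping in purely for illustration. The function names and the choice of `k` are assumptions.

```python
def vaf(ref_reads, var_reads):
    """Variant allele fraction from reference and variant read counts."""
    depth = ref_reads + var_reads
    if depth == 0:
        raise ValueError("no coverage at this site")
    return var_reads / depth


def kmeans_1d(values, k, n_iter=50):
    """Naive 1-D k-means with evenly spaced starting centers.

    Returns (labels, centers). A stand-in for illustration only.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

In a pure, copy-number-neutral region, a cluster near 0.5 would be consistent with heterozygous mutations in the founding clone and lower clusters with subclones, but that interpretation depends on purity and copy number, which this sketch ignores.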
- Chris Miller
Well played, sir. In retrospect, doing the experiment myself would have been faster than the 10 minutes I spent searching and typing this post. Thanks!
- Chris Miller
If I take a BAM that's already aligned and has dups marked, remove a bunch of reads, then re-run Picard's MarkDuplicates, will it correctly clear the duplicate flag on reads that are no longer duplicates (because the reads they duplicated have been removed)?
This question is proving surprisingly hard to google for.
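One way to just run the experiment: dump the records before and after as SAM text (for example with `samtools view`) and compare the duplicate bit, which is 0x400 in the SAM FLAG field. The Python sketch below does that comparison; the function names are hypothetical, and it ignores secondary and supplementary alignments for simplicity.

```python
DUP = 0x400        # SAM FLAG bit: PCR or optical duplicate
MATE_BITS = 0xC0   # 0x40 = first in pair, 0x80 = second in pair


def dup_status(sam_lines):
    """Map (read name, mate bits) -> True if the duplicate bit is set."""
    status = {}
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        status[(qname, flag & MATE_BITS)] = bool(flag & DUP)
    return status


def flag_changes(before_lines, after_lines):
    """List (key, was_dup, is_dup) for reads present in both inputs
    whose duplicate bit differs."""
    before, after = dup_status(before_lines), dup_status(after_lines)
    shared = before.keys() & after.keys()
    return sorted((key, before[key], after[key])
                  for key in shared if before[key] != after[key])
```

If the second MarkDuplicates run clears stale flags, reads that lost their duplicate partners should show up here as `True` to `False` changes; an empty list would mean no flags changed.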
- Chris Miller