Calls simple duplications (dup) and pyrgo, which are clusters or "towers" of overlapping duplications.

Duplications are defined as junctions that have low junction copy number and connect two nodes of low junction copy number (with cn and jcn thresholds provided as parameters). Simple duplications have no other overlapping junctions in their shadow. Pyrgo contain at least min.count overlapping duplications and are also outliers (<fdr.thresh) in a negative binomial model that incorporates the regional (non-DUP) junction count in tile.width genomic bins (set to 1 Mbp by default).
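The outlier step described above can be illustrated with a toy base R sketch. It is not the FishHook model that the function runs internally: the per-tile counts are made up, the expected count is assumed to be proportional to the background (non-DUP) junction count, and the dispersion is fixed by hand rather than estimated. The column names (dup.count, bg.count, candidate) are hypothetical.

```r
# Toy illustration only: made-up counts per 1 Mbp tile, not gGnome/FishHook output.
tiles <- data.frame(
  tile      = 1:6,
  dup.count = c(0, 1, 2, 9, 1, 0),   # DUP-like junctions overlapping each tile
  bg.count  = c(3, 4, 5, 4, 20, 2)   # regional non-DUP junctions (background covariate)
)

# Assumption: expected DUP count scales with the background junction count.
rate <- sum(tiles$dup.count) / sum(tiles$bg.count)
mu   <- rate * tiles$bg.count

# Assumption: fixed dispersion; a real model would estimate it from the data.
size <- 2

# Upper-tail negative binomial p-value, P(X >= observed count).
tiles$p <- pnbinom(tiles$dup.count - 1, size = size, mu = mu, lower.tail = FALSE)

# Benjamini-Hochberg adjustment, then apply the count and FDR thresholds.
tiles$fdr  <- p.adjust(tiles$p, method = "BH")
min.count  <- 2
fdr.thresh <- 0.5
tiles$candidate <- tiles$dup.count >= min.count & tiles$fdr < fdr.thresh

print(tiles)
```

In this sketch a tile is flagged only when it both meets min.count and falls below fdr.thresh, which mirrors the two conditions the description places on pyrgo.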

Note: Not all DUP-like junctions will be called a dup or pyrgo.

dup(
  gg,
  fdr.thresh = 0.5,
  tile.width = 1e+06,
  jcn.thresh = 1,
  min.count = 2,
  return.fish = FALSE,
  mark = FALSE,
  mark.col = "purple",
  min.width = 10000,
  max.width.flank = 10000,
  max.width = 1e+07
)

Arguments

gg

gGraph with $cn field annotated on nodes and edges

fdr.thresh

False discovery rate threshold for FishHook-calculated events. Default: 0.5

tile.width

bin width to use when computing duplication clustering. Default: 1e6

jcn.thresh

edge copy number threshold to call a duplication in ploidy units. Default: 1

min.count

minimum number of overlapping duplications to constitute a pyrgo, includes the tile.width. Default: 2

return.fish

whether to return the FishHook::Fish() output instead of the annotated gGraph. Default: FALSE

mark

whether to color duplication events. Default: FALSE

mark.col

color of duplication events. Default: purple

min.width

min width of duplications to consider for pyrgo. Default: 1e4

max.width.flank

max width of the flank on each side of a junction of class DEL-like to consider. Default: 1e4

max.width

max width of duplications to consider for pyrgo. Default: 1e7

Value

gGraph with nodes and edges annotated with $dup and $pyrgo metadata fields, and data.tables $meta$pyrgo and $meta$set with event-level statistics.

If return.fish = TRUE, returns FishHook::Fish() output.
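A minimal usage sketch follows, showing a call with the documented defaults and a look at the annotations listed above. It assumes gGnome is installed and that a JaBbA example file sits at the extdata path used below; that path is an assumption, so substitute your own gGraph if it is not present. The code is guarded so that it does nothing when either is missing.

```r
# Sketch only: the call is skipped if gGnome or the example file is unavailable.
has.ggnome <- requireNamespace("gGnome", quietly = TRUE)

if (has.ggnome) {
  # Assumed location of an example JaBbA output; replace with your own file.
  jabba.path <- system.file("extdata", "hcc1954", "jabba.rds", package = "gGnome")

  if (nzchar(jabba.path)) {
    # Build a gGraph with copy number annotated on nodes and edges.
    gg <- gGnome::gG(jabba = jabba.path)

    # Call duplications and pyrgo with the documented default thresholds,
    # coloring the called events.
    gg <- gGnome::dup(gg, fdr.thresh = 0.5, tile.width = 1e6,
                      min.count = 2, mark = TRUE, mark.col = "purple")

    # Edge-level annotation and event-level statistics named in the Value section.
    print(table(gg$edges$dt$dup, useNA = "ifany"))
    print(gg$meta$pyrgo)
  }
}
```

Passing return.fish = TRUE to the same call would return the FishHook::Fish() output rather than the annotated gGraph, which can help when checking how the fdr.thresh cutoff behaves on a given sample.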

Details

For more details on how to run the function, with examples, see: Duplications & Pyrgo