Parse structural variation data in various formats into junctions.

jJ(rafile,
   keep.features = T,
   seqlengths = NULL,
   chr.convert = T,
   geno = NULL,
   flipstrand = FALSE,
   swap.header = NULL,
   breakpointer = FALSE,
   seqlevels = NULL,
   force.bnd = FALSE,
   skip = NA)

Arguments

rafile

path to the junctions file. See details for the compatible formats.

keep.features

logical; if TRUE, preserve metadata from the input

seqlengths

a named numeric vector containing reference contig lengths

chr.convert

logical; if TRUE, strip the "chr" prefix from contig names

geno

logical; whether to parse the 'geno' fields of the VCF

hg

character, human genome version

flipstrand

logical; if TRUE, flip the strand of every breakpoint

swap.header

path to an alternative VCF header file

breakpointer

logical; if TRUE, parse the input as Breakpointer output

seqlevels

vector for renaming the chromosomes

force.bnd

logical; if TRUE, overwrite the "type" of every junction with "BND"

skip

numeric; number of lines to skip when reading the input
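
As a rough illustration of two of the arguments above, the base-R sketch below builds a named numeric vector of the shape described for seqlengths and mimics the effect described for chr.convert. The lengths are the GRCh37/hg19 values for chromosomes 1 and 2. The sub() call is only a stand-in: this page does not specify how jJ strips the prefix internally.

```r
# A named numeric vector of reference contig lengths, the shape described
# for `seqlengths` (values are GRCh37/hg19 lengths of chromosomes 1 and 2).
seqlengths <- c(chr1 = 249250621, chr2 = 243199373)

# Assumption: stripping the "chr" prefix, as described for chr.convert = TRUE,
# behaves like this regular-expression substitution on the contig names.
stripped <- seqlengths
names(stripped) <- sub("^chr", "", names(stripped))

print(stripped)
```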

Value

a Junction object

Details

A junction is an unordered pair of strand-specific genomic locations (breakpoints). Within a given reference genome coordinate system, we call the direction in which coordinates increase "+". A breakpoint is a width-1 (start == end) genomic range with a specified strand, where "+" means that the side with the larger coordinate is fused to the other breakpoint of the junction.
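
To make the definition concrete, here is a minimal base-R sketch that stores one junction as a two-row data frame. The column names and the deletion-like example are hypothetical and chosen for illustration; they are not the internal layout of a Junction object. The reading of "-" as "the side with the smaller coordinate is fused" is inferred from the definition of "+" above.

```r
# Two breakpoints of one junction, each a width-1 range with a strand.
# Hypothetical example: the sequence between chr1:1000 and chr1:2000 is lost,
# so position 1000 is joined directly to position 2000.
junction <- data.frame(
  seqnames = c("chr1", "chr1"),
  start    = c(1000, 2000),
  end      = c(1000, 2000),
  strand   = c("-", "+"),
  stringsAsFactors = FALSE
)

# Every breakpoint must have width 1, i.e. start == end.
stopifnot(all(junction$start == junction$end))

# "+" : the side with the larger coordinate is fused to the partner.
# "-" : (inferred) the side with the smaller coordinate is fused.
junction$fused_side <- ifelse(junction$strand == "+",
                              "larger coordinates",
                              "smaller coordinates")
print(junction)
```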

rafile must be in one of the following formats:

1) VCF (Variant Call Format). We currently support the VCF output of several structural variation detection methods, namely SvABA (https://github.com/walaj/svaba), DELLY (https://github.com/dellytools/delly), LUMPY (https://github.com/arq5x/lumpy-sv), and novoBreak (https://sourceforge.net/projects/novobreak/). In theory, any VCF that uses BND-style records should be compatible, but be cautious with the output of other methods, since the community has not yet adopted a universal data definition.

2) BEDPE (http://bedtools.readthedocs.io/en/latest/content/general-usage.html#bedpe-format)

3) Textual output from Breakpointer (http://archive.broadinstitute.org/cancer/cga/breakpointer)

4) R serialized object storing junctions (.rds)
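
As a sketch of the BEDPE route, the code below writes a one-junction BEDPE file with the ten standard bedtools columns and then tries to read it. Several things here are assumptions: that jJ is provided by the gGnome package, that the input format is recognized from a ".bedpe" file extension, and that a header-less toy file like this one is accepted. None of these is stated on this page, so the call is guarded and any error is returned as NULL.

```r
# Write a one-junction BEDPE file (10 standard columns, no header line).
# BEDPE start coordinates are 0-based and end coordinates are 1-based.
bedpe <- data.frame(
  chrom1 = "chr1", start1 = 999,  end1 = 1000,
  chrom2 = "chr1", start2 = 1999, end2 = 2000,
  name = "toy_junction", score = 0,
  strand1 = "-", strand2 = "+",
  stringsAsFactors = FALSE
)
path <- tempfile(fileext = ".bedpe")
write.table(bedpe, path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Assumption: gGnome is installed and exports jJ as documented here.
# Whether jJ accepts this exact toy file has not been verified, so any
# error is caught and reported as NULL.
junctions <- NULL
if (requireNamespace("gGnome", quietly = TRUE)) {
  junctions <- tryCatch(gGnome::jJ(path), error = function(e) NULL)
}
```

With a real file, the same call takes the path to the caller's output; the other arguments keep their defaults unless the input needs them.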

Warning

We assume that the orientation definition in the input is consistent with ours. Check the documentation of the method that produced your input to make sure. If it uses the opposite convention, set flipstrand = TRUE to reconcile the two.
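
Conceptually, flipping the strand swaps "+" and "-" at each breakpoint. The base-R sketch below shows that operation on a plain character vector. It illustrates the idea only and is not the code jJ runs; the variable names are hypothetical.

```r
# Strands of the two breakpoints as reported by a hypothetical caller that
# uses the opposite orientation convention.
reported <- c("+", "-")

# Swap "+" and "-" for every breakpoint.
flipped <- ifelse(reported == "+", "-", "+")

print(flipped)
# [1] "-" "+"
```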

Author

Xiaotong Yao