SNPsplit
Full list of options for SNPsplit
Note
USAGE: SNPsplit [options] --snp_file <SNP.file.gz> [input file(s)]
Input file(s)
Mapping output file in SAM or BAM format. SAM files (ending in .sam) will first be converted to BAM files.
--snp_file
Mandatory file specifying the SNP positions to be considered; it may be a plain text file or gzip compressed. Currently, the SNP file is expected to be in the following format:
Only the information contained in the fields 'Chromosome', 'Position' and 'Ref/SNP base' is used for analysis. The genome referred to as 'Ref' will be used as genome 1, the genome containing the 'SNP' base as genome 2.
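Because the SNP file may be either plain text or gzip compressed, a script that pre-checks the file has to handle both cases. The following is a minimal Python sketch of one way to do that; it is not SNPsplit's own code, and the function name open_snp_file is hypothetical. It detects gzip by the two-byte magic number instead of trusting the file extension.

```python
import gzip


def open_snp_file(path):
    """Open a SNP file in text mode, whether plain or gzip compressed.

    Hypothetical helper for illustration; not part of SNPsplit.
    """
    # Every gzip stream starts with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")
```

The returned handle can be iterated line by line in the same way for both kinds of input.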
--single_end
Manually sets the data type to single-end and skips AUTO-DETECT.
--paired
Paired-end mode. (Default: AUTO-DETECT)
-o/--outdir <dir>
Write all output files into this directory. By default the output files will be written into the same folder as the input file(s). If the specified folder does not exist, SNPsplit will attempt to create it first. The path to the output folder can be either relative or absolute.
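The output-folder rule described above (default to the folder of the input file, otherwise use the given folder and create it if needed) can be sketched in Python as follows. This is an illustration only: the function name resolve_outdir is hypothetical, and creating missing parent folders is an assumption, since the text only says SNPsplit will attempt to create the folder.

```python
from pathlib import Path


def resolve_outdir(input_file, outdir=None):
    """Return the folder output files would be written to.

    Hypothetical helper for illustration; not part of SNPsplit.
    """
    if outdir is None:
        # Default: same folder as the input file.
        return Path(input_file).parent
    target = Path(outdir)  # may be relative or absolute
    # Assumption: missing parent folders are created as well.
    target.mkdir(parents=True, exist_ok=True)
    return target
```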
--singletons
If the allele-tagged paired-end file also contains singleton alignments (which is the default for e.g. TopHat), these will be written out to extra files (ending in _st.bam) instead of writing everything to combined paired-end and singleton files. Default: OFF.
--no_sort
This option skips the sorting step if BAM files are already sorted by read name (e.g. Hi-C files generated by HiCUP). Please note that setting --no_sort for unsorted paired-end files will break the tagging process!
--hic
Assumes Hi-C data processed with HiCUP as input, i.e. the input BAM file is paired-end and Reads 1 and 2 follow each other. Thus, this option also sets the flags --paired and --no_sort. Default: OFF.
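The way --hic implies other flags can be pictured with a small Python sketch. It only mirrors the rule stated above (--hic also sets --paired and --no_sort); the function name expand_flags is hypothetical and this is not how SNPsplit itself parses its options.

```python
def expand_flags(flags):
    """Return the given flags plus any flags they imply.

    Hypothetical helper for illustration; not part of SNPsplit.
    """
    expanded = set(flags)
    if "--hic" in expanded:
        # Hi-C input from HiCUP is paired-end and already ordered by read pair.
        expanded.update({"--paired", "--no_sort"})
    return expanded
```

For example, expand_flags(["--hic"]) contains --hic, --paired and --no_sort, while a call without --hic returns the flags unchanged.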
--bisulfite
Assumes Bisulfite-Seq data processed with Bismark as input. In paired-end mode (--paired), Read 1 and Read 2 of a pair are expected to follow each other in consecutive lines. SNPsplit will run a quick check at the start of a run to see if the provided file appears to be a Bismark file, and set the flags --bisulfite and/or --paired automatically. In addition it will perform a quick check to see if a paired-end file appears to have been positionally sorted, and if not will set the flag --no_sort.
--samtools_path
The path to your Samtools installation, e.g. /home/user/samtools/. Does not need to be specified explicitly if Samtools is in the PATH environment already.
SNPsplit-sort specific options (tag2sort):
--sam
The output will be written out in SAM format instead of BAM (default). SNPsplit will attempt to use the path to Samtools that was specified with --samtools_path, or, if it hasn't been specified, attempt to find Samtools in the PATH environment. If no installation of Samtools can be found, the SAM output will be compressed with GZIP instead (yielding a .sam.gz output file).
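The Samtools lookup order described above (an explicitly specified path first, then the PATH environment) can be sketched in Python like this. It is a rough illustration under stated assumptions, not SNPsplit's implementation: the function name find_samtools is hypothetical, and it assumes the executable inside the installation folder is called samtools.

```python
import os
import shutil


def find_samtools(samtools_path=None):
    """Return the path to a usable samtools executable, or None.

    Hypothetical helper for illustration; not part of SNPsplit.
    """
    if samtools_path:
        candidate = samtools_path
        if os.path.isdir(samtools_path):
            # Assumption: the binary inside the folder is named "samtools".
            candidate = os.path.join(samtools_path, "samtools")
        # With a directory component, shutil.which checks that exact file.
        return shutil.which(candidate)
    # No explicit path given: search the PATH environment.
    return shutil.which("samtools")
```

A return value of None corresponds to the case in which no Samtools installation can be found.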
--skip_tag2sort
Carry out the allele-tagging process, and exit afterwards. This might be desirable when using SNPsplit in pipelining systems, such as Nextflow, when a deduplication step is to be added following allele tagging.
--conflicting/--weird
Reads or read pairs that were classified as 'Conflicting' (XX:Z:CF) will be written to an extra file (ending in .conflicting.bam) instead of being simply skipped. Reads may be classified as 'Conflicting' if a single read contains SNP information for both genomes at the same time, or if the SNP position was deleted from the read. Read pairs are considered 'Conflicting' if either read was tagged with the XX:Z:CF flag. Default: OFF.
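The pair-level rule above (a read pair is 'Conflicting' if either read carries XX:Z:CF) can be checked on SAM text lines with a short Python sketch. The function names are hypothetical and this is not SNPsplit's own code; it relies only on the SAM convention that optional tags follow the 11 mandatory tab-separated fields.

```python
CONFLICTING_TAG = "XX:Z:CF"


def is_conflicting(sam_line):
    """True if a single SAM alignment line carries the XX:Z:CF tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start after the 11 mandatory SAM fields.
    return CONFLICTING_TAG in fields[11:]


def pair_is_conflicting(read1_line, read2_line):
    """True if either read of a pair was tagged as conflicting."""
    return is_conflicting(read1_line) or is_conflicting(read2_line)
```

A script could use pair_is_conflicting to decide whether a pair would end up in the extra conflicting file when --conflicting is set.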
--help
Displays this help information and exits
--verbose
(Very!) verbose output (for debugging)
--version
Displays version information and exits