SRA Toolkit Documentation

Tool: sam-dump

Usage:
sam-dump [options] <path/file> [<path/file> ...]
sam-dump [options] <accession>
Frequently Used Options:
General:
-h | --help Display ALL options, general usage, and version information
-V | --version Display the version of the program
Data formatting:
-1 | --primary Output only primary alignments
-c | --cigar-long Output long version of CIGAR
-r | --header Always reconstruct header
-n | --no-header Do not output headers
-s | --seqid Print reference SEQ_ID in RNAME instead of NAME
-= | --hide-identical Output '=' if base is identical to reference
--reverse Reverse unaligned reads according to read type
--rna-splicing Modify the CIGAR string and output flags if RNA splicing is detected
Filtering:
-u | --unaligned Output unaligned reads along with aligned reads
--aligned-region <name[:from-to]> Filter by position on the genome. Name can be either a file-specific name (e.g. "chr1" or "1") or the reference sequence accession. "from" and "to" (inclusive) are 1-based coordinates
--matepair-distance <from-to|'unknown'> Filter by distance between matepairs. Use "unknown" to find matepairs split between the references. Use from-to (inclusive) to limit matepair distance on the same reference
--unaligned-spots-only Output reads only for spots with no aligned reads
--min-mapq Minimum MAPQ an alignment must have to be printed
Workflow and piping:
--output-file Write output to this file (instead of STDOUT)
--gzip Compress output using gzip
--bzip2 Compress output using bzip2
--option-file <file> Read more options and parameters from the file.
Use examples:
sam-dump SRR390728
Writes SAM-format data to standard output. The run does not need to contain alignment information to be output in this format.
sam-dump --aligned-region 1:6484848-6521430 --output-file SRR390728.sam SRR390728
Stores only the alignments in the region 6484848-6521430 on chromosome 1 in the file SRR390728.sam. The reference must be named by either the sequence name as submitted for the alignment (@SQ SN in SAM/BAM files) or the reference sequence accession.
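The region value can be assembled and sanity-checked before sam-dump is called. The following is a minimal sketch, assuming a POSIX shell; is_uint and region_arg are hypothetical helpers for illustration and are not part of the toolkit. It only enforces the rule stated above, that "from" and "to" are 1-based and inclusive, and then prints the resulting command instead of running it.

```shell
# is_uint: succeed only if the argument is a non-empty string of digits.
# (Hypothetical helper, not part of sam-dump.)
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# region_arg NAME FROM TO: print NAME:FROM-TO for --aligned-region.
# (Hypothetical helper.) Rejects coordinates that are not 1-based
# integers with FROM <= TO.
region_arg() {
    if ! is_uint "$2" || ! is_uint "$3"; then
        echo "from and to must be non-negative integers" >&2
        return 1
    fi
    if [ "$2" -lt 1 ] || [ "$2" -gt "$3" ]; then
        echo "expected 1 <= from <= to" >&2
        return 1
    fi
    printf '%s:%s-%s\n' "$1" "$2" "$3"
}

region=$(region_arg 1 6484848 6521430)

# Dry run: print the command. Remove "echo" to execute it.
echo sam-dump --aligned-region "$region" --output-file SRR390728.sam SRR390728
```

The helper does not check that the reference name exists in the run; sam-dump itself remains the authority on that.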
sam-dump SRR390728 | samtools view -bS - > SRR390728.bam
With "samtools" installed, the above command pipes (|) the sam-dump output to samtools for conversion into .bam format (view -bS; the "-" following -bS tells samtools to read the streaming data from sam-dump on standard input).
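The same streaming approach can feed samtools sort, which is often wanted because most downstream tools expect a coordinate-sorted, indexed BAM. The following is a sketch under assumptions, not a prescribed workflow: it assumes a samtools 1.x release (where the input format is detected automatically, so -S is not needed) and a bash-compatible shell for pipefail. require_tools is a hypothetical helper, and the output file name SRR390728.sorted.bam is chosen here for illustration.

```shell
# require_tools: report every named command missing from PATH and fail
# if any is missing. (Hypothetical helper, not part of either toolkit.)
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

acc=SRR390728   # accession from the example above

if require_tools sam-dump samtools; then
    # Subshell so that the shell options below do not leak into the caller.
    (
        set -e
        set -o pipefail   # bash-specific: fail if sam-dump fails mid-stream
        # samtools sort reads SAM from standard input ("-") and writes
        # coordinate-sorted BAM to the file given with -o.
        sam-dump "$acc" | samtools sort -o "$acc.sorted.bam" -
        # Index the sorted BAM so that regions can be retrieved quickly.
        samtools index "$acc.sorted.bam"
    )
fi
```

Without pipefail the exit status of the pipeline is that of samtools alone, so a truncated sam-dump stream could go unnoticed.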
sam-dump -r --gzip --output-file SRR390728.sam.gz SRR390728
Produces a gzip-compressed (--gzip) output file (--output-file), SRR390728.sam.gz, with a reconstructed header (-r). The header will include reference accessions but may not include some header information from the submitted data.
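A compressed output file can be spot-checked without unpacking it to disk. The sketch below assumes the standard gzip utility is available; check_sam_gz is a hypothetical helper. It tests the integrity of the gzip stream and confirms that the first line is a SAM header line (header lines start with "@"), which is what a file written with -r should begin with.

```shell
# check_sam_gz FILE: succeed if FILE is a valid gzip stream whose first
# line is a SAM header line. (Hypothetical helper, not part of the toolkit.)
check_sam_gz() {
    file=$1
    # gzip -t verifies the compressed data without writing any output.
    if ! gzip -t "$file" 2>/dev/null; then
        echo "$file: not a valid gzip stream" >&2
        return 1
    fi
    # Decompress to standard output and look at the first line only.
    first=$(gzip -dc "$file" | head -n 1)
    case "$first" in
        @*) return 0 ;;
        *)  echo "$file: first line is not a SAM header line" >&2
            return 1 ;;
    esac
}

# Check the file from the example above, if it has been produced.
if [ -f SRR390728.sam.gz ]; then
    check_sam_gz SRR390728.sam.gz && echo "SRR390728.sam.gz looks valid"
fi
```

This check says nothing about the alignment records themselves; it only catches truncated downloads and files written without a header.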
Possible errors and their solution:
sam-dump.2.x err: item not found while reading file - input object(s) not found
This error indicates that the .sra file cannot be found. Confirm that the path to the file is correct.
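The path can be checked from the shell before sam-dump is invoked, which helps separate a mistyped path from a toolkit problem. This is a minimal sketch: check_sra_path is a hypothetical helper, and ./SRR390728.sra is an assumed local path used only for illustration.

```shell
# check_sra_path PATH: succeed if PATH exists and is readable; otherwise
# report the path together with the current directory, since relative
# paths are a common cause of this error.
# (Hypothetical helper, not part of the toolkit.)
check_sra_path() {
    if [ -r "$1" ]; then
        return 0
    fi
    echo "cannot read '$1' (current directory: $(pwd))" >&2
    return 1
}

input=./SRR390728.sra   # assumed local path; adjust to your file

# Run sam-dump only if the input is readable and the tool is installed.
if check_sra_path "$input" && command -v sam-dump >/dev/null 2>&1; then
    sam-dump "$input" > SRR390728.sam
fi
```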
sam-dump.2.x int: name not found while resolving tree within virtual file system module - VCursorCellDataDirect( row#1 . idx#3 . READ ) char_ptr failed
The data are likely reference compressed and the toolkit is unable to acquire the reference sequence(s) needed to extract the .sra file. Please confirm that you have tested and validated the configuration of the toolkit. If you have elected to prevent the toolkit from contacting NCBI, you will need to manually acquire the reference(s) here