|Run||Spots||Bases||Size||GC content||Published||Access Type|
This run has 2 reads per spot:
|L=101, 100%||L=101, 100%|
Legend: Technical read; Application read.
L=4, 100%: length is 4, and 100% of spots contain this read.
L̅=165, σ=92.8, 66%: average length is 165, standard deviation is 92.8, and 66% of spots contain this read.
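The read notation can be turned into a rough per-spot base count by multiplying each read's (average) length by the fraction of spots that contain it. The sketch below is only an illustration of that arithmetic; the function name `expected_bases_per_spot` is hypothetical and not part of any SRA tool.

```python
def expected_bases_per_spot(reads):
    """Sum length * fraction over reads.

    reads: list of (average_length, fraction_of_spots_containing_read).
    """
    return sum(length * fraction for length, fraction in reads)

# This run: two reads, each L=101 and present in 100% of spots.
this_run = [(101, 1.0), (101, 1.0)]
print(expected_bases_per_spot(this_run))  # 202.0
```

Multiplying the result by the number of spots gives an estimate of the total base count for a run.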
|SAMN02981228 (SRS892297)||Rhesus monkey ID 17573 is a female from the Southwest National Primate Research Center at the Southwest Foundation for Biomedical Research in San Antonio, TX who was born Dec. 21, 1990 in San Antonio and died Jan. 25, 2002. All animals in that branch of the colony are derived from Indian origin rhesus brought to the US many years ago. This animal is not likely to be closely related to the Indian origin animal that provided DNA for the BAC library.||Macaca mulatta|
|SRP010319||Rhesus macaque17573 Exome sequencing|
Genomic DNA was isolated from a rhesus macaque (Macaca mulatta), animal number 17573 (reference animal). A human exome kit, the SureSelect XT HumanAllExon 50Mb kit (G7544A), was used to pull down fragments containing exons. Fragments were sequenced on an Illumina HiSeq (2X101). This study is part of a project to improve the rhesus macaque reference assembly and annotation. The exomic data can also be used to identify the frequency of mutations in rhesus exons.
You need the SRA Toolkit to operate on SRA runs.
The default toolkit configuration lets it find and retrieve SRA runs by accession. It also downloads (and caches) only the part of the data you really need. For example, quality scores account for the majority of the data volume, and you may not need them if you dump fasta only (rather than fastq). Likewise, if you are looking at a particular gene, you may not need reads aligned to other regions or reads that are not aligned at all. In the same way, if you use GATK with SRA support enabled, you need only the SRA run accessions to start your process.
fastq-dump will dump reads in a number of "standard" fastq and fasta formats.
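As a rough illustration of the fasta-versus-fastq choice, the sketch below assembles a `fastq-dump` command line in Python without running it. The accession `SRR0000000` is a placeholder (this page does not show the run accession), and the helper name `build_fastq_dump_command` is hypothetical. The `--split-files`, `--fasta`, and `--outdir` options are documented fastq-dump options, but confirm them with `fastq-dump --help` for your installed toolkit version.

```python
import shlex


def build_fastq_dump_command(accession, fasta_only=False, split_files=True, outdir="."):
    """Assemble (but do not run) a fastq-dump argument list."""
    cmd = ["fastq-dump"]
    if split_files:
        # Write each read of a spot to its own file (this run has 2 reads per spot).
        cmd.append("--split-files")
    if fasta_only:
        # FASTA output omits quality scores; a width of 0 disables line wrapping.
        cmd += ["--fasta", "0"]
    cmd += ["--outdir", outdir, accession]
    return cmd


# Placeholder accession, used only for illustration.
RUN = "SRR0000000"
print(shlex.join(build_fastq_dump_command(RUN, fasta_only=True)))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting concerns.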
vdb-dump can also produce fasta and fastq (besides other formats). It dumps data much faster than fastq-dump, but the ordering of reads may differ and it does not produce split-read multi-file output.
The prefetch tool helps you cache all data in advance if you plan to run data analysis in an environment where getting data from NCBI at run time is infeasible.
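A minimal sketch of caching runs ahead of time might look like the following. It assumes `prefetch` is on your `PATH` and accepts one or more accessions as positional arguments; the helper names `prefetch_command` and `prefetch_runs` and the accession `SRR0000000` are placeholders, not part of the toolkit.

```python
import shutil
import subprocess


def prefetch_command(accessions):
    """Return the argument list for caching the given runs with prefetch."""
    return ["prefetch", *accessions]


def prefetch_runs(accessions):
    """Run prefetch if it is installed.

    Returns False when the tool is not found on PATH; otherwise runs it and
    returns True. A non-zero exit status raises CalledProcessError.
    """
    if shutil.which("prefetch") is None:
        return False
    subprocess.run(prefetch_command(accessions), check=True)
    return True
```

Calling `prefetch_runs(["SRR0000000"])` on a machine with network access would populate the local cache, so that later analysis steps on an offline node can read the run from disk.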
Read more in the SRA Knowledge Base on how to download SRA data using command-line utilities.