|Run||Spots||Bases||Size||GC content||Published||Access Type|
This run has 2 reads per spot:
|L=4, 100%||̅L=263, σ=47.8, 100%|
Technical read: L=4, 100% (length is 4; 100% of spots contain this read)
Application read: ̅L=165, σ=92.8, 66% (average length is 165, standard deviation is 92.8; 66% of spots contain this read)
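To make the read layout concrete, here is a minimal sketch of how a spot could be split into its two reads and how the length statistics could be summarized. It assumes the technical read occupies the first 4 bases of every spot and uses the population standard deviation; the page does not state which estimator is used. The function names and the example spots are hypothetical and not taken from this run.

```python
from statistics import mean, pstdev

TECHNICAL_LEN = 4  # length of the technical read reported above (L=4)


def split_spot(spot_bases, technical_len=TECHNICAL_LEN):
    """Split one spot into (technical read, application read).

    Assumption: the technical read is the fixed-length prefix of the spot.
    """
    return spot_bases[:technical_len], spot_bases[technical_len:]


def length_summary(spots, technical_len=TECHNICAL_LEN):
    """Return (mean length, standard deviation, fraction of spots) for the
    application read, counting only spots where that read is non-empty."""
    lengths = [len(split_spot(s, technical_len)[1]) for s in spots]
    present = [n for n in lengths if n > 0]
    return mean(present), pstdev(present), len(present) / len(spots)


# Made-up spots for illustration only; the third has no application read.
example_spots = ["TCAGACGTACGT", "TCAGGGCC", "TCAG"]

avg, sd, fraction = length_summary(example_spots)
print(f"mean={avg}, sd={sd}, spots with application read={fraction:.0%}")
```

A spot with an empty application read lowers the "spots contain this read" percentage without affecting the mean, which is one way a figure below 100% can arise.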
|PRJNA28969||SRP000101||Metagenomics and metatranscriptomics of marine plankton under conditions of ocean acidification|
The dissolution of anthropogenic CO2 from the atmosphere into the oceans will decrease surface pH by 0.3 units over the next 100 years (IPCC 2007); the phenomenon is referred to as ocean acidification (OA). There are concerns that OA will affect marine microorganisms, with significant impacts on marine biogeochemical cycling. To investigate this impact, a mesocosm experiment was set up in a Norwegian fjord in May 2006. Six bags, each containing 11,000 L of sea water, were suspended in a coastal fjord. CO2 was bubbled through three of these bags to simulate ocean acidification conditions in the year 2100. The other three bags were bubbled with air. A phytoplankton bloom was induced in all six bags, and phytoplankton, bacterioplankton and physicochemical characteristics were measured and analyzed over an 18-day period. Water samples from the peak of the phytoplankton bloom were isolated and used to follow the decline of the bloom. Nucleic acid extractions were performed to analyse bacterial diversity and functionality using 454 metagenomics and 454 metatranscriptomics. Sequencing the metatranscriptome can provide information about the response of organisms to varying environmental conditions. A methodology for obtaining random whole-community mRNA from a complex microbial assemblage using pyrosequencing was used. The metatranscriptome had minimal contamination by ribosomal RNA and significant coverage of abundant transcripts, and it included significantly more potentially novel proteins than the metagenome. Four 454 metatranscriptomic datasets and four 454 metagenomic datasets have been produced. These were derived from four samples: "Day 1, High CO2 Bag" and "Day 1, Present Day Bag" refer to the metatranscriptomes from the peak of the bloom; "Day 2, High CO2 Bag" and "Day 2, Present Day Bag" refer to the metatranscriptomes following the decline of the bloom.
High CO2 refers to the ocean acidification mesocosm and Present Day refers to the control mesocosm. <a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=&db=Nucleotide&cmd=search&term=EU012135:EU012221[accn]">EU012135-EU012221</a>, <a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=&db=Nucleotide&cmd=search&term=EU421954:EU421957[accn]">EU421954-EU421957</a>, <a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=&db=Nucleotide&cmd=search&term=EU410956[accn]">EU410956</a>, <a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=&db=Nucleotide&cmd=search&term=EU476008[accn]">EU476008</a>, and <a href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=&db=Nucleotide&cmd=search&term=EU819141:EU819142[accn]">EU819141-EU819142</a> are genomic sequences derived from this study. 454 sequence data is available from the Short Read Archive (SRA): <a href="ftp://ftp.ncbi.nih.gov/pub/TraceDB/ShortRead/SRA000266">SRA000266</a>. 454 metatranscriptomic data is available from the Gene Expression Omnibus (GEO): <a href="http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE10119">GSE10119</a>.
You need the SRA Toolkit to operate on SRA runs.
The default toolkit configuration enables it to find and retrieve SRA runs by accession. It also downloads (and caches) only the part of the data you really need. For example, quality scores account for the majority of the data volume, and you may not need them if you dump only fasta (rather than fastq). Likewise, if you are looking at a particular gene, you may not need reads aligned to other regions or reads that are not aligned at all. In the same way, if you use GATK with SRA support enabled, you need only the SRA run accessions to start your process.
fastq-dump will dump reads in a number of "standard" fastq and fasta formats.
vdb-dump is also capable of producing fasta and fastq (besides other formats). It dumps data much faster than fastq-dump, but the ordering of reads may be different and it does not produce split-read multi-file output.
The prefetch tool will help you cache all data in advance if you plan to run data analysis in an environment where getting data from NCBI at run time is infeasible.
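As a rough illustration of how the three tools above might be invoked, the following sketch builds their command lines and runs a command only if the tool is found on the PATH. The accession `SRR000000` is a placeholder, not this run's accession, and the helper names are hypothetical. The flags shown (`--output-directory`, `--outdir`, `--split-files`, `--fasta`, `--format`) are believed to exist in recent SRA Toolkit releases, but check each tool's `--help` output for your installed version.

```python
import shutil
import subprocess

PLACEHOLDER_ACCESSION = "SRR000000"  # hypothetical; substitute a real run accession


def prefetch_cmd(accession, out_dir="."):
    """Command line that caches a run locally ahead of analysis."""
    return ["prefetch", "--output-directory", out_dir, accession]


def fastq_dump_cmd(accession, out_dir=".", fasta=False, split_files=True):
    """Command line that dumps reads as fastq, or as fasta without qualities."""
    cmd = ["fastq-dump", "--outdir", out_dir]
    if split_files:
        cmd.append("--split-files")  # one output file per read of the spot
    if fasta:
        cmd += ["--fasta", "0"]  # fasta only; 0 requests no line wrapping
    cmd.append(accession)
    return cmd


def vdb_dump_cmd(accession, fmt="fasta"):
    """Command line for the faster vdb-dump route (no split-read output)."""
    return ["vdb-dump", "--format", fmt, accession]


def run_if_available(cmd):
    """Run cmd if its executable is installed; otherwise just print it."""
    if shutil.which(cmd[0]) is None:
        print("not installed, would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them against a placeholder accession.
    for cmd in (
        prefetch_cmd(PLACEHOLDER_ACCESSION),
        fastq_dump_cmd(PLACEHOLDER_ACCESSION, fasta=True),
        vdb_dump_cmd(PLACEHOLDER_ACCESSION),
    ):
        print(" ".join(cmd))
```

Dumping fasta instead of fastq skips the quality scores, which is the case where the toolkit can avoid downloading most of the data volume.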
Read more at SRA Knowledge Base on how to download SRA data using command line utilities.
The sections below show the results of analysis run by software that is still in an experimental stage. Please take the provided results with a boatload of salt and let us know what you think.
-- SRA team
- Unidentified reads: 100%