HiSeq X Ten paired end sequencing (ERR3403276)
Run | Spots | Bases | Size | GC Content | Data Status | Published |
---|---|---|---|---|---|---|
ERR3403276 | 887.4k | 268.0M | 105.8MB | 55.2% | Public | 2019-06-29 |
This run has 2 reads per spot:
L=151, 100% | L=151, 100% |
Legend: reads are shown as either technical reads or application reads.
- "L=4, 100%" means the read length is 4 and 100% of spots contain this read.
- "L̅=165, σ=92.8, 66%" means the average length is 165, the standard deviation is 92.8, and 66% of spots contain this read.
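As a quick sanity check, the spot and base counts in the run table are consistent with two 151-base reads in every spot. The sketch below redoes that arithmetic. The spot count is taken from the rounded "887.4k" figure, so the result is approximate, and the function name is only illustrative.

```python
# Figures copied from the run table; the spot count is rounded there.
spots = 887_400            # "887.4k" spots
read_lengths = [151, 151]  # two reads per spot, each present in 100% of spots


def expected_bases(spots, read_lengths):
    """Total bases if every spot carries every read at full length."""
    return spots * sum(read_lengths)


total = expected_bases(spots, read_lengths)
print(f"{total / 1e6:.1f}M bases")  # prints "268.0M bases"
```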
ENA-FIRST-PUBLIC | 2019-06-28 |
ENA-LAST-UPDATE | 2019-06-28 |
Experiment | Library Name | Platform | Strategy | Source | Selection | Layout |
---|---|---|---|---|---|---|
ERX3427041 | DN520489N:E7 | Illumina | WGS | GENOMIC | RANDOM | PAIRED |
Biosample | Sample Description | Organism |
---|---|---|
SAMEA104694464 (ERS2295847) | | Klebsiella grimontii |
Bioproject | SRA Study | Title |
---|---|---|
PRJEB22252 | ERP024601 | Baby_Biome_Study_gastrointestinal_bacterial_genomes |
The study investigates how early microbe exposure and the developing immune system influence subsequent health and developmental outcomes. Culturing efforts coupled with whole genome sequencing of the gastrointestinal bacteria can address key questions that are computationally non-trivial using shotgun metagenomics alone. This comprehensive gut microbiota-derived bacterial genome collection provides the basis to improve the taxonomic classification resolution of metagenomic analysis, and to allow subsequent in vitro and in vivo experiments on host physiology and the gut colonisation process. To get broad and comprehensive coverage of the gut microbiota, we cultivated bacterial species from human faecal samples on different selective agar media and broth. Samples are whole-genome sequenced on Illumina X10 150bp PE.
SRA archive data is normalized by the SRA load process and used by the SRA Toolkit to read and produce formats like FASTQ, SAM, etc. The default toolkit configuration enables it to find and retrieve SRA runs by accession.
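Retrieval by accession with the SRA Toolkit might look like the following minimal sketch. It assumes the toolkit's `prefetch` and `fasterq-dump` programs are installed and on the PATH. The helper names `toolkit_commands` and `fetch_fastq` are hypothetical, and the choice of `--split-files` output is an assumption made to suit a paired layout.

```python
import shutil
import subprocess


def toolkit_commands(accession, outdir="."):
    """Build prefetch and fasterq-dump command lines for one run accession."""
    return [
        # Download the run by accession using the default toolkit configuration.
        ["prefetch", accession],
        # Convert the run to FASTQ, writing each read of a spot to its own file.
        ["fasterq-dump", "--split-files", "--outdir", outdir, accession],
    ]


def fetch_fastq(accession, outdir="."):
    """Run the commands in order, failing early if a tool is missing."""
    for cmd in toolkit_commands(accession, outdir):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)
```

Calling `fetch_fastq("ERR3403276")` would then download this run and convert it, provided the toolkit is installed.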
Public SRA files are now available from GCP and AWS cloud platforms as well as from NCBI. Access to most data in the cloud requires a user account with the cloud service provider. The user’s account will incur costs for cloud compute or to copy data outside of the specified cloud service region.
Type | Version | Created | Size | Location | Name | Free Egress | Access Type |
---|---|---|---|---|---|---|---|
SRA Normalized | 1 | 2019-06-29 | 105.8MB | AWS | https://sra-pub-run-odp.s3.amazonaws.com/sra/ERR3403276/ERR3403276 | worldwide | anonymous |
SRA Lite | 1 | 2020-06-19 | 64.3MB | NCBI | https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos3/sra-pub-zq-22/ERR003/403/ERR3403276.sralite.1 | worldwide | anonymous |
 | | | | AWS | s3://sra-pub-zq-7/ERR3403276/ERR3403276.sralite.1 | s3.us-east-1 | aws identity |
 | | | | GCP | gs://sra-pub-zq-106/ERR3403276/ERR3403276.zq.1 | gs.us-east1 | gcp identity |
The original files submitted to SRA. These files may require specific software to open, read and interpret data.
Type | Version | Created | Size | Location | Name | Free Egress | Access Type |
---|---|---|---|---|---|---|---|
cram | 1 | 2019-08-24 | 105.2MB | EBI | http://ftp.sra.ebi.ac.uk/vol1/run/ERR340/ERR3403276/25964_2#341.cram | worldwide | anonymous |
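The file name in the EBI URL contains a `#` character. Passed unescaped to an HTTP client, everything after `#` is normally treated as a URL fragment and never sent to the server. The sketch below percent-encodes the path before use. It assumes the `#` is a literal part of the file name, and the helper name `encode_listed_url` is hypothetical.

```python
from urllib.parse import quote


def encode_listed_url(listed):
    """Percent-encode the path of a URL copied verbatim from the table."""
    scheme, rest = listed.split("://", 1)
    host, _, path = rest.partition("/")
    # Keep "/" as a path separator; "#" becomes "%23".
    return f"{scheme}://{host}/{quote(path, safe='/')}"


url = encode_listed_url(
    "http://ftp.sra.ebi.ac.uk/vol1/run/ERR340/ERR3403276/25964_2#341.cram"
)
print(url)
# prints http://ftp.sra.ebi.ac.uk/vol1/run/ERR340/ERR3403276/25964_2%23341.cram
```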
Egress - the term cloud providers use for the cost (charged to the user) of moving data outside of the storage region.
The Free Egress column indicates where the data can be accessed without an egress charge:
- worldwide - can be downloaded from anywhere for free
- s3.us-east-1 - free to access from machines running in Amazon's us-east-1 region; access from other regions or transport outside of AWS incurs egress charges
- gs.US - free to access from machines running in Google’s gs.US region; access from other regions or transport outside of GCP incurs egress charges
Access Type - describes whether a cloud service user account is necessary for data access. "anonymous" access means general public access.
Name - provides either a link to a free download location at NCBI or a URL for the cloud provider storage location: s3:// for Amazon or gs:// for Google storage.
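The Free Egress rules above can be sketched as a small check. This is a simplification under stated assumptions: the value is either "worldwide" or a storage prefix and region joined by a dot, and the region must match exactly (a broader value such as gs.US may cover more than one region in practice). The function name and the provider labels "aws" and "gcp" are hypothetical.

```python
# Assumed mapping from the storage prefix in the Free Egress column to a provider.
PROVIDER_BY_PREFIX = {"s3": "aws", "gs": "gcp"}


def egress_is_free(free_egress, provider=None, region=None):
    """Return True if a client at (provider, region) avoids egress charges."""
    if free_egress == "worldwide":
        return True
    prefix, _, free_region = free_egress.partition(".")
    expected = PROVIDER_BY_PREFIX.get(prefix)
    if expected is None or region is None:
        return False
    # Simplification: require the same provider and an exact region match.
    return provider == expected and region.lower() == free_region.lower()


print(egress_is_free("worldwide"))                          # prints True
print(egress_is_free("s3.us-east-1", "aws", "us-east-1"))   # prints True
print(egress_is_free("s3.us-east-1", "aws", "eu-west-1"))   # prints False
```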
To support large-scale (hyper-parallel) data analyses, SRA data is now available at GCP and AWS, with a few caveats:
- SRA data is copied to the cloud from NCBI. There may be a lag between availability from NCBI and from CSPs (cloud service providers).
- A user account with the cloud service provider is required to access public data. Your account will incur costs for cloud compute and/or for copying data (either archival data or the results of your compute) outside of the specified cloud service region.
- Distribution of protected data is signed by an NIH account and requires the user to operate in the same region as the data.
SRA has also begun to provide access to originally submitted source files, with the following caveats:
- not all files have been validated by SRA;
- not all source files have been made available, for various reasons (contamination cleared on load but unsuitable for publishing, subjects who consented to share their virus but not their own genome, public health pipelines that put hospital or plant IDs in file names, etc.);
- the volume of this type of data is much larger and it is not used as often, so we will keep most of it on tape or in "cold" storage in the cloud. As a result, the data may not be available instantly; restore requests will be served on a first-come, first-served basis, and the cost of a restore may be charged to your user account.
Please visit the SRA Data Delivery service to request that Sequence Read Archive (SRA) data be delivered to an Amazon Web Services (AWS) or Google Cloud Platform (GCP) bucket of your choice.