ERR3403276 : Run Browser : SRA Archive : NCBI


Sequence Read Archive

Run

Run	Spots	Bases	Size	GC Content	Data Status	Published
ERR3403276	887.4k	268.0M	105.8MB	55.2%	Public	2019-06-29
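As a quick consistency check on the run statistics, the base count can be recomputed from the spot count and the read layout reported under the quality graph. This is only a sketch: it assumes that both reads in each spot have length 151 (the page lists a single `L=151, 100%` entry for a run with 2 reads per spot) and that `887.4k` is a rounded figure.

```python
# Values taken from the run table; the spot count is rounded on the page.
spots = 887_400          # "887.4k"
reads_per_spot = 2       # "This run has 2 reads per spot"
read_length = 151        # "L=151, 100%" (assumed to apply to both reads)

expected_bases = spots * reads_per_spot * read_length
expected_megabases = round(expected_bases / 1e6, 1)

print(f"expected bases: {expected_bases} (~{expected_megabases}M)")
```

Under these assumptions the result rounds to the 268.0M bases shown in the table.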

Quality graph (Phred quality score, axis labelled 0 to 41)

This run has 2 reads per spot:

L=151, 100%

Legend

		Technical read
		Application Read
L=4, 100%		Length is 4, 100% spots contain this read
L̅=165, σ=92.8, 66%		Average length is 165, standard deviation is 92.8, 66% spots contain this read


ENA-FIRST-PUBLIC	2019-06-28
ENA-LAST-UPDATE	2019-06-28

Experiment

Experiment	Library Name	Platform	Strategy	Source	Selection	Layout
ERX3427041	DN520489N:E7	Illumina	WGS	GENOMIC	RANDOM	PAIRED


Illumina sequencing of library DN520489N:E7, constructed from sample accession ERS2295847 for study accession ERP024601. This is part of an Illumina multiplexed sequencing run (25964_2). This submission includes reads tagged with the sequence TACCGAGC.

Biosample

Biosample	Sample Description	Organism
SAMEA104694464 (ERS2295847)		Klebsiella grimontii

Bioproject

Bioproject	SRA Study	Title
PRJEB22252	ERP024601	Baby_Biome_Study_gastrointestinal_bacterial_genomes


The study investigates how early microbe exposure and the developing immune system influence subsequent health and developmental outcomes. Culturing efforts coupled with whole genome sequencing of the gastrointestinal bacteria can address key questions that are computationally non-trivial using shotgun metagenomics alone. This comprehensive gut microbiota-derived bacterial genome collection provides the basis to improve the taxonomic classification resolution of metagenomic analysis, and to allow subsequent in vitro and in vivo experiments on host physiology and the gut colonisation process. To get a broad and comprehensive coverage of the gut microbiota, we cultivated bacterial species from human faecal samples on different selective agar media and broth. Samples are whole-genome sequenced on Illumina X10 150bp PE.

SRA archive data

SRA archive data is normalized by the SRA load process and used by the SRA Toolkit to read and produce formats like FASTQ, SAM, etc. The default toolkit configuration enables it to find and retrieve SRA runs by accession.
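Retrieval by accession with the SRA Toolkit might look like the following sketch. It assumes the toolkit's `prefetch` and `fasterq-dump` programs are installed and on `PATH`; the helper names `toolkit_commands` and `run_commands` and the output directory `fastq` are hypothetical and chosen only for illustration.

```python
import shlex
import shutil
import subprocess


def toolkit_commands(accession: str, outdir: str = "fastq") -> list[list[str]]:
    """Build the command lines to fetch a run and convert it to FASTQ."""
    return [
        # Download the run into the toolkit's local cache.
        ["prefetch", accession],
        # Convert to FASTQ, writing one file per read of the paired layout.
        ["fasterq-dump", accession, "--split-files", "--outdir", outdir],
    ]


def run_commands(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found; is the SRA Toolkit installed?")
        subprocess.run(cmd, check=True)


# Show the commands without executing them.
for cmd in toolkit_commands("ERR3403276"):
    print(shlex.join(cmd))
```

Calling `run_commands(toolkit_commands("ERR3403276"))` would execute the two steps; building the commands separately makes it easy to inspect them first.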

Public SRA files are now available from GCP and AWS cloud platforms as well as from NCBI. Access to most data in the cloud requires a user account with the cloud service provider. The user’s account will incur costs for cloud compute or to copy data outside of the specified cloud service region.

Type	Version	Created	Size	Location	Name	Free Egress	Access Type
SRA Normalized	1	2019-06-29	105.8MB	AWS	https://sra-pub-run-odp.s3.amazonaws.com/sra/ERR3403276/ERR3403276	worldwide	anonymous
SRA Lite	1	2020-06-19	64.3MB	NCBI	https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos3/sra-pub-zq-22/ERR003/403/ERR3403276.sralite.1	worldwide	anonymous
				AWS	s3://sra-pub-zq-7/ERR3403276/ERR3403276.sralite.1	s3.us-east-1	aws identity
				GCP	gs://sra-pub-zq-106/ERR3403276/ERR3403276.zq.1	gs.us-east1	gcp identity
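To pick out the copies that need no cloud account, the table above can be represented as records and filtered. This is an illustrative sketch: the `Location` type and `no_account_downloads` function are hypothetical, and it assumes the two rows with blank leading cells are further locations of the SRA Lite file.

```python
from typing import NamedTuple


class Location(NamedTuple):
    file_type: str
    provider: str
    url: str
    free_egress: str
    access_type: str


# Rows transcribed from the "SRA archive data" table.
LOCATIONS = [
    Location("SRA Normalized", "AWS",
             "https://sra-pub-run-odp.s3.amazonaws.com/sra/ERR3403276/ERR3403276",
             "worldwide", "anonymous"),
    Location("SRA Lite", "NCBI",
             "https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos3/sra-pub-zq-22/ERR003/403/ERR3403276.sralite.1",
             "worldwide", "anonymous"),
    Location("SRA Lite", "AWS",
             "s3://sra-pub-zq-7/ERR3403276/ERR3403276.sralite.1",
             "s3.us-east-1", "aws identity"),
    Location("SRA Lite", "GCP",
             "gs://sra-pub-zq-106/ERR3403276/ERR3403276.zq.1",
             "gs.us-east1", "gcp identity"),
]


def no_account_downloads(locations: list[Location]) -> list[Location]:
    """Keep locations that are anonymous and free to download from anywhere."""
    return [
        loc for loc in locations
        if loc.free_egress == "worldwide" and loc.access_type == "anonymous"
    ]


for loc in no_account_downloads(LOCATIONS):
    print(loc.file_type, loc.provider, loc.url)
```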

Original format

The original files submitted to SRA. These files may require specific software to open, read and interpret data.

Type	Version	Created	Size	Location	Name	Free Egress	Access Type
cram	1	2019-08-24	105.2MB	EBI	http://ftp.sra.ebi.ac.uk/vol1/run/ERR340/ERR3403276/25964_2#341.cram	worldwide	anonymous
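The CRAM file name contains a `#` character, which URL parsers normally treat as the start of a fragment, so a client may silently drop everything after it. A minimal sketch of one way to handle this is to percent-encode the file name before requesting it; the helper name `encode_filename` is hypothetical.

```python
from urllib.parse import quote

CRAM_URL = "http://ftp.sra.ebi.ac.uk/vol1/run/ERR340/ERR3403276/25964_2#341.cram"


def encode_filename(url: str) -> str:
    """Percent-encode only the last path component of a URL."""
    # Split on the final "/" so the scheme and directory path stay untouched.
    prefix, _, filename = url.rpartition("/")
    return f"{prefix}/{quote(filename)}"


print(encode_filename(CRAM_URL))
```

The encoded form replaces `#` with `%23` and leaves the rest of the URL unchanged.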

Egress and Access: what does it mean?

Egress - term used by cloud providers to describe cost (charged to the user) of moving data outside of the storage region

Free Egress column indicates where the data can be accessed without an egress charge:

worldwide - can be downloaded from anywhere for free
s3.us-east-1 - free to access from machines running in Amazon's us-east-1 region; access from other regions or transfer outside of AWS incurs egress charges
gs.US - free to access from machines running in Google’s gs.US region; access from other regions or transfer outside of GCP incurs egress charges

Access Type - describes whether a cloud service user account is necessary for data access. "anonymous" access means general public access

Name - provides either a link to a free download location at NCBI or a URL for the cloud provider storage location (s3:// for Amazon storage, gs:// for Google storage)
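The Free Egress rule described above can be expressed as a small check. This is a sketch under a simplifying assumption: the client's location is written with the same label the column uses (for example `s3.us-east-1`), and anything other than an exact match is treated as paid. The function name `egress_is_free` is hypothetical.

```python
def egress_is_free(free_egress: str, client_location: str) -> bool:
    """Return True if reading the data from client_location avoids egress charges."""
    if free_egress == "worldwide":
        # Downloadable from anywhere at no charge.
        return True
    # Otherwise only clients in the named cloud region avoid the charge.
    return client_location == free_egress


print(egress_is_free("worldwide", "laptop"))           # True
print(egress_is_free("s3.us-east-1", "s3.us-east-1"))  # True
print(egress_is_free("s3.us-east-1", "s3.eu-west-1"))  # False
```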

Why is SRA data in the cloud?

To support large-scale (hyper-parallel) data analyses, SRA data is now available at GCP and AWS, with a few caveats:

SRA data is copied to the cloud from NCBI. There may be a lag between availability from NCBI and from CSP (cloud service providers).
To access public data, a user account with the cloud service provider is required. Your account will incur costs for cloud compute and/or to copy data (either archival data or the results of your compute) outside of the specified cloud service region.
Distribution of protected data is signed by an NIH account and requires the user to operate in the same region as the data.

SRA has also begun to provide access to originally submitted source files:

not all files have been validated by SRA;
not all source files have been made available, for various reasons (contamination cleared on load but unsuitable for publishing, subjects who consented to share their virus but not their own genome, public health pipelines that put hospital or plant IDs from their systems in file names, etc.);
the volume of this type of data is much larger and it is used less often, so we will keep most of it on tape or in "cold" cloud storage. As a result, the data may not be available instantly; restore requests will be served on a first-come, first-served basis, and the cost of a restore may be charged to your user account.

What is "Cloud Data Delivery"?

Please visit SRA Data Delivery service to request Sequence Read Archive (SRA) data to be delivered to an Amazon Web Services (AWS) or Google Cloud Platform (GCP) bucket of your choice.

HiSeq X Ten paired end sequencing (ERR3403276)