Web service 'SRA Database Backend'

SRA Database Backend runs a pre-defined set of MS SQL stored procedures.

How it works

The SRA Database Backend has several endpoints and accepts additional optional parameters.

Some parameters are pre-defined; others are not. For each request, the application looks up the entry [proc]/<endpoint> in its INI file.

For example, for the endpoint 'provisional_table' the entry can be GET_Submission_table 1, '<F provisional="1" />', @begin@, @end@, @ord@, @dir@

In that case the application also expects the CGI parameters 'begin', 'end', 'ord', and 'dir'. The values of those parameters are sanitized and substituted for their placeholders. Placeholders that receive no value from the CGI parameters are set to 'null'. The application then runs the SQL request and displays the output.
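The substitution step described above can be sketched in bash. This is only an illustration, not the application's code: the function name fill_template is hypothetical, and the sanitization rule (keep only letters, digits, and underscore) is an assumption, since the actual rule is not documented here. Quoting of string values for SQL is left out for the same reason.

```shell
#!/bin/bash
# Hypothetical sketch: fill @name@ placeholders in a stored-procedure template.

fill_template() {
    local template="$1"; shift
    local pair name value
    # Each remaining argument is a name=value pair taken from the CGI parameters.
    for pair in "$@"; do
        name="${pair%%=*}"
        value="${pair#*=}"
        # Assumed sanitization: drop everything except [A-Za-z0-9_].
        value="${value//[^A-Za-z0-9_]/}"
        # A value that is empty after sanitization is treated as missing.
        [ -n "$value" ] || continue
        template="${template//@${name}@/${value}}"
    done
    # Every placeholder that is still unfilled becomes null.
    printf '%s\n' "$template" | sed -E 's/@[A-Za-z_]+@/null/g'
}

fill_template "GET_Submission_table 1, '<F provisional=\"1\" />', @begin@, @end@, @ord@, @dir@" begin=1 end=10 ord=acc
```

With 'dir' omitted, the sketch prints GET_Submission_table 1, '<F provisional="1" />', 1, 10, acc, null.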

List of available endpoints and requests

Output is in XML format unless stated otherwise.

  1. study_table?begin=1&end=10&ord=acc&dir=a - table of studies, ordered by 'ord' in direction 'dir'.
  2. study_table?begin=1&end=10&ord=acc&dir=a&uid=2,3 - table of studies, ordered by 'ord' in direction 'dir', for Entrez UIDs 2,3.
  3. study_table?begin=1&end=10&ord=acc&dir=a&uid=<IdList><Id>2</Id><Id>3</Id></IdList> - table of studies, ordered by 'ord' in direction 'dir', for Entrez UIDs 2,3.
  4. study_table?begin=1&end=10&ord=acc&dir=a&term=cat - table of studies, ordered by 'ord' in direction 'dir', for all UIDs found in the Entrez SRA database by searching for term='cat'.
  5. provisional_table?begin=1&end=10&ord=acc&dir=a - table of provisional studies.
  6. analysis_table?begin=1&end=10 - table of analyses.
  7. run?acc=SRR000002 - metadata for given run.
  8. run_new?acc=SRR000002 - metadata for given run (new version, not in production).
  9. experiment?acc=SRX000002 - metadata for given experiment.
  10. exp?acc=SRX000002 - package metadata for given experiment.
  11. exp?uid=2,3 - package metadata for given experiment Entrez UIDs. You cannot mix UIDs and accessions in one request.
  12. exp_status?acc=SRX000002 - get status for given experiment (only from internal NCBI network).
  13. sample?acc=SRS000003 - metadata for given sample.
  14. study?acc=SRP000002 - metadata for given study accession.
  15. analysis?acc=DRZ000003 - metadata for given analysis accession.
  16. rao/run=SRR1199225&ref=chr1 - run alignment options.
  17. run_taxonomy?acc=SRR000002 - taxonomy analysis tree.
  18. acclist?acc=SRX000002,SRR000001 - list of runs for given accession list, text output by default.
  19. runinfo?acc=SRX00000,SRR000001&uid=2,4 - list of run attributes for given accession list and/or list of Entrez UIDs, text output by default.
  20. runinfo4pathogen?acc=SRX000002,SRR000001&uid=2,4 - list of run attributes for given accession list and/or list of Entrez UIDs, adapted for pathogens, text output by default.
  21. run_by_exp?acc=SRR000002 - list of runs belonging to the experiment of given run, text output by default.
  22. csv - comma-separated table of statistical information about SRA, text output.
  23. wgs_path?acc=AAAA02 - metadata for given WGS accession.
  24. dump2blast?exp=2 - get "dump data to Blast" SRA run metadata for given experiment Entrez UID.
  25. dump2blast?exp=SRX000002 - get "dump data to Blast" SRA run metadata for given experiment accession.
  26. dump2blast?from=2021-02-14T20:00:00&to=2021-02-15 - get "dump data to Blast" SRA run metadata for given period of time (up to 3 days).
  27. dump2blast?from=2021-02-14T20:00:00&to=2021-02-15 - get SRA runs updated for given period of time (up to 3 days).
  28. submission_status?acc=SRX000002&signer-name=<signer name>&signer-key=<key name> with the signature in the header x-sdl-signature=<signature> - get information about submission status for given experiment. Requires a signed HTTP request.
  29. submission-controlled-vocabulary - get Submission controlled vocabulary for platforms, strategies, sources, selections.
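The requests in the list above can be assembled with a small helper. The sketch below assumes that all endpoints share the base URL used by the signed-request script in this document (https://trace.ncbi.nlm.nih.gov/Traces/sra-db-be); the function names urlencode and sra_url are hypothetical, and the encoder handles ASCII input only. Commas are left unencoded because the documented requests use them literally.

```shell
#!/bin/bash
# Hypothetical helpers for building request URLs; nothing is sent over the network.

# Percent-encode an ASCII string, keeping unreserved characters and commas.
urlencode() {
    local LC_ALL=C
    local s="$1" out="" c i
    for (( i = 0; i < ${#s}; i++ )); do
        c="${s:i:1}"
        case "$c" in
            [A-Za-z0-9._~,-]) out+="$c" ;;
            *) printf -v c '%%%02X' "'$c"; out+="$c" ;;
        esac
    done
    printf '%s\n' "$out"
}

# Build "<base>/<endpoint>?name=value&..." from name=value arguments.
sra_url() {
    # Assumption: every endpoint lives under the same base URL.
    local base="https://trace.ncbi.nlm.nih.gov/Traces/sra-db-be"
    local endpoint="$1"; shift
    local query="" pair
    for pair in "$@"; do
        query+="${query:+&}${pair%%=*}=$(urlencode "${pair#*=}")"
    done
    printf '%s\n' "$base/$endpoint${query:+?$query}"
}

sra_url study_table begin=1 end=10 ord=acc dir=a uid=2,3
sra_url run acc=SRR000002
# To send a request (network access required), for example:
# curl -s "$(sra_url run acc=SRR000002)"
```

Keeping URL construction separate from the transfer makes it easy to inspect a request before sending it.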

Pre-defined CGI parameters

  1. sp - alias for endpoint.
  2. acc - accession or comma delimited list of SRA accessions.
  3. arg - synonym for 'acc' (for backward compatibility with 'anysql' web service).
  4. uid - comma-delimited list of Entrez UIDs, or XML such as <IdList><Id>2</Id><Id>3</Id></IdList>, from which 2 and 3 will be extracted.
  5. term - term for Entrez search backend. For some requests (study_table) a non-empty 'term' makes the application call the search backend for a list of UIDs (up to 1,000,000,000).
  6. mode - if set to 'hup', request 'runinfo' will provide data for HUP runs (only for NCBI internal users).
  7. db - for debug purposes; if set to some 'name', server 'name' will be used instead of the default 'SRA_READ'.
  8. retmode - if set to 'xml', output is produced in XML format; otherwise the application produces 'text' output. If omitted, the default report is generated.
  9. query_key - query key passed from eSearch Entrez Backend. Must be combined with WebEnv.
  10. WebEnv - history ID passed from eSearch Entrez Backend along with the query key.
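The two accepted forms of 'uid' can be reduced to one comma-delimited list on the client side. The sketch below is an illustration only: the function name normalize_uids is hypothetical, and treating every run of digits as a UID is an assumption, not a description of how the server parses the parameter.

```shell
#!/bin/bash
# Hypothetical sketch: turn either form of 'uid' into a comma-delimited list.

normalize_uids() {
    # Assumption: every run of digits in the input is one UID.
    printf '%s\n' "$1" | grep -oE '[0-9]+' | paste -sd, -
}

normalize_uids '<IdList><Id>2</Id><Id>3</Id></IdList>'
normalize_uids '2,3'
```

Both calls print 2,3.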

How to create signed HTTP request

        #!/bin/bash
        # Request the submission status of an experiment with a signed HTTP request.
        GetSubmissionStatus() {
            local acc="$1"          # experiment accession
            local signer_name="$2"
            local signer_key="$3"
            local keyfile="$4"      # private key file used for signing

            local url="https://trace.ncbi.nlm.nih.gov/Traces/sra-db-be/submission_status"
            local expired_seconds=30

            # 'et' is the current Unix time plus expired_seconds.
            local et=$(( $(date +%s) + expired_seconds ))
            local ip=$(hostname -i)
            local request="acc=$acc&et=$et&signer-name=$signer_name&signer-key=$signer_key"
            # Sign the query string followed by the host IP (SHA-256), Base64-encode the result, then URL-encode it.
            local signature=$(echo -n "$request$ip" | openssl dgst -sha256 -sign "$keyfile" | /usr/bin/base64 | perl -p -e 's/\n/%0A/' | sed -e 's/=/%3D/g' -e 's/+/%2B/g' -e 's|/|%2F|g')
            #wget --header "x-sdl-signature:$signature" "$url?$request"
            curl -sH "x-sdl-signature:$signature" "$url?$request"
        }

        GetSubmissionStatus SRX000002 "test" "key_1" "private.pem"
    

Updated: Wed Jan 3 11:38:39 EST 2024