Search API Docs
Some resources:
- Documentation: https://data.microbiomedata.org/docs
Some API endpoints require authentication. You are authenticated automatically if you are signed in to the data portal website with your ORCID account. As long as you have an active session in the portal, you can use the Swagger API page.
Outside the browser, the API relies on cookie authentication.
- Find the session cookie for your current browser login
chrome://settings/siteData?searchSubpage=microbiomedata&search=cookies
- Your curl request will look like this:
curl -X 'GET' 'https://data.dev.microbiomedata.org/api/metadata_submission?offset=0&limit=25' -H 'accept: application/json' --cookie "session=PASTE_COOKIE_HERE"
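The same cookie-authenticated request can be sketched in Python. This is a minimal example that uses only the standard library; `build_request` is a hypothetical helper name, and the cookie value is a placeholder for your own session cookie.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://data.dev.microbiomedata.org"  # same host as the curl example


def build_request(path, session_cookie, **params):
    """Build a GET request that carries the portal session cookie.

    build_request is a hypothetical helper, not part of the API.
    """
    query = urllib.parse.urlencode(params)
    url = f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"
    headers = {
        "Accept": "application/json",
        "Cookie": f"session={session_cookie}",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


request = build_request(
    "/api/metadata_submission", "PASTE_COOKIE_HERE", offset=0, limit=25
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request).read()
```

Nothing is sent until `urlopen` is called, so the request can be inspected first.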
- Study ID lookup: https://data.microbiomedata.org/api/study/gold:Gs0114663
- Sample ID lookup: https://data.microbiomedata.org/api/biosample/gold:Gb0126437
- Sample search: https://data.microbiomedata.org/docs#/biosample/Search_for_biosamples_api_biosample_search_post
- Study search: https://data.microbiomedata.org/docs#/study/Search_for_studies_api_study_search_post
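The two ID lookup URLs above follow the same pattern, so they can be built from an item type and an identifier. A small sketch, assuming the pattern holds for other IDs as well; `lookup_url` is a hypothetical helper.

```python
from urllib.parse import quote

API_ROOT = "https://data.microbiomedata.org/api"


def lookup_url(item, identifier):
    """Return the lookup URL for a single record.

    lookup_url is a hypothetical helper; item is "study" or "biosample".
    """
    # Keep the ":" of prefixed IDs such as "gold:Gs0114663" unescaped,
    # matching the URLs listed above.
    return f"{API_ROOT}/{item}/{quote(identifier, safe=':')}"


print(lookup_url("study", "gold:Gs0114663"))
print(lookup_url("biosample", "gold:Gb0126437"))
```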
A search is a POST request with a JSON body that describes the query.
- See what you can search by using https://data.microbiomedata.org/api/summary
- You can also interactively learn how to build search payloads by using the Chrome debug tools network inspector.
Example 1:
{"conditions":[{"value":"gold:Gs0114675","table":"study","op":"==","field":"study_id"}],"data_object_filter":[]}
Example 2:
{"conditions":[{"op":"==","field":"omics_type","value":"Organic Matter Characterization","table":"omics_processing"}],"data_object_filter":[]}
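Payloads like Examples 1 and 2 can also be assembled in Python rather than written by hand. The sketch below assumes only the keys shown in those examples; `condition` and `search_payload` are hypothetical helper names.

```python
import json


def condition(table, field, value, op="=="):
    """Return one search condition (hypothetical helper)."""
    return {"op": op, "field": field, "value": value, "table": table}


def search_payload(*conditions):
    """Wrap conditions in the body shape used by Examples 1 and 2."""
    return {"conditions": list(conditions), "data_object_filter": []}


example_1 = search_payload(condition("study", "study_id", "gold:Gs0114675"))
example_2 = search_payload(
    condition("omics_processing", "omics_type", "Organic Matter Characterization")
)

# Key order differs from the examples above, but the JSON is equivalent.
print(json.dumps(example_1))
print(json.dumps(example_2))
```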
The response JSON structure for most endpoints can be inferred from the TypeScript interfaces.
For example, study search returns a SearchResponse<StudySearchResults>, which can be interpreted as a SearchResponse where the generic result slot is typed as StudySearchResults.
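For Python users, the same generic shape can be mirrored with a dataclass. This is only a sketch: the `count` and `results` keys are taken from the sample search response further down this page, while `StudySearchResult`, `parse_search_response`, `parse_study`, and the sample body are illustrative assumptions, not the portal's actual types or data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class SearchResponse(Generic[T]):
    """Envelope holding a total count and one page of typed results."""

    count: int
    results: List[T]


@dataclass
class StudySearchResult:
    """Illustrative subset of a study record; real records carry more fields."""

    id: str
    name: str


def parse_search_response(
    body: Dict[str, Any], parse_item: Callable[[Dict[str, Any]], T]
) -> SearchResponse[T]:
    """Convert a decoded JSON body into a typed SearchResponse."""
    return SearchResponse(
        count=body["count"],
        results=[parse_item(record) for record in body["results"]],
    )


def parse_study(record: Dict[str, Any]) -> StudySearchResult:
    return StudySearchResult(id=record["id"], name=record["name"])


# Made-up body for illustration only; the name is a placeholder.
body = {"count": 1, "results": [{"id": "gold:Gs0114675", "name": "placeholder"}]}
response = parse_search_response(body, parse_study)
print(response.count, response.results[0].id)
```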
The following code snippet can help construct and run queries from Python.
import requests


class Query(object):
    """Builds and runs a search against the data portal API."""

    item = None  # set by subclasses to the table being searched

    def __init__(self):
        # Per-instance state; a class-level list would be shared by every query.
        self.offset = 0
        self.limit = 15
        self.filters = []
        self.results = None

    def __getitem__(self, key):
        # query[a:b] sets offset and limit; query[i] selects a single result.
        if isinstance(key, slice):
            self.offset = 0 if key.start is None else key.start
            if key.stop is None:
                self.limit = 100 - self.offset
            else:
                self.limit = key.stop - self.offset
        elif isinstance(key, int):
            self.offset = key
            self.limit = 1
        else:
            raise TypeError("index must be an int or a slice")
        self.results = None  # drop results fetched with the old paging
        return self

    def __iter__(self):
        if self.results is None:
            response = self._request()
            response.raise_for_status()  # surface HTTP errors instead of hiding them
            self.results = response.json()["results"]
        for result in self.results:
            yield result

    def filter(self, **kwargs):
        # "item" names the table the fields belong to; it defaults to the query's own table.
        table = kwargs.pop("item", None)
        if table is None:
            table = self.item
        for field, value in kwargs.items():
            self.filters.append(dict(op="==", field=field, value=value, table=table))
        return self

    def _request(self):
        url = f"https://data.microbiomedata.org/api/{self.item}/search"
        params = dict(offset=self.offset, limit=self.limit)
        # json= serializes the body and sets the Content-Type header.
        return requests.post(url, params=params, json=dict(conditions=self.filters))


class BiosampleQuery(Query):
    item = "biosample"


class StudyQuery(Query):
    item = "study"


class OmicsProcessingQuery(Query):
    item = "omics_processing"
With that code loaded you can perform queries such as:
query = (BiosampleQuery()
         .filter(ecosystem_type="Soil")
         .filter(item="gene_function", id="KEGG.ORTHOLOGY:K00003"))
for item in query[0:5]:
    print(item["name"])
Filters assume that fields belong to the current query type, which in this example allows searching by ecosystem_type without specifying the item. Gene functions, on the other hand, must be joined in, so item="gene_function" is necessary to indicate that the id field applies to a gene function.
The slicing operation [0:5] on the query above sets an offset and limit on the search to retrieve partial results.
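To make the slice-to-paging mapping concrete, here is a standalone sketch of the arithmetic. `slice_to_paging` is a hypothetical helper, and the ceiling of 100 for open-ended slices is taken from the snippet above rather than from any documented API limit.

```python
def slice_to_paging(key, max_limit=100):
    """Translate a Python slice into (offset, limit) query parameters.

    Hypothetical helper. max_limit=100 mirrors the constant used by the
    Query class on this page for slices without a stop.
    """
    offset = 0 if key.start is None else key.start
    if key.stop is None:
        limit = max_limit - offset
    else:
        limit = key.stop - offset
    return offset, limit


offset, limit = slice_to_paging(slice(0, 5))
print(f"https://data.microbiomedata.org/api/biosample/search?offset={offset}&limit={limit}")
print(slice_to_paging(slice(20, 40)))
print(slice_to_paging(slice(10, None)))
```

So [0:5] asks for offset=0 and limit=5, and [20:40] asks for 20 results starting at offset 20.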
This example prints the download URLs of assembly contig fasta files of soil samples:
query = OmicsProcessingQuery().filter(omics_type="Metagenome").filter(item="biosample", ecosystem_type="Soil")
for item in query[0:20]:
    for omics in item["omics_data"]:
        for output in omics["outputs"]:
            if output["file_type"] == "Assembly Contigs":
                print(f"https://data.microbiomedata.org{output['url']}")
This example finds studies with James Stegen as PI:
query = StudyQuery().filter(principal_investigator_name="James Stegen")
for item in query:
    print(item["name"])
The following examples search for biosamples based on these criteria:
- PI name ("Mitchel J. Doktycz")
- Broad-scale environmental context ("terrestrial biome" [ENVO:00000446])
- Environmental medium ("bulk soil" [ENVO:00005802])
- Omics type ("Metagenome")
- Processing institution ("JGI")
Find all biosamples matching these criteria.
POST
https://data.dev.microbiomedata.org/api/biosample/search
Payload
{
  "conditions": [
    {
      "op": "==",
      "field": "principal_investigator_name",
      "value": "Mitchel J. Doktycz",
      "table": "study"
    },
    {
      "op": "==",
      "field": "env_broad_scale",
      "value": "terrestrial biome",
      "table": "biosample"
    },
    {
      "op": "==",
      "field": "env_medium",
      "value": "bulk soil",
      "table": "biosample"
    },
    {
      "op": "==",
      "field": "omics_type",
      "value": "Metagenome",
      "table": "omics_processing"
    },
    {
      "op": "==",
      "field": "processing_institution",
      "value": "JGI",
      "table": "omics_processing"
    }
  ],
  "data_object_filter": []
}
Response (truncated)
{
"count": 113,
"results": [
{
"id": "gold:Gb0291625",
"name": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA - BESC-388-CL2_50_4",
"description": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA",
"alternate_identifiers": [
"img.taxon:3300046812",
"gold:Gb0291625"
],
"annotations": {
"type": "nmdc:Biosample",
"habitat": "Soil",
"lat_lon": "46.1216 -123.2701",
"location": "poplar common garden site in Clatskanie, Oregon, USA",
"community": "microbial communities",
"identifier": "BESC-388-CL2_50_4",
"geo_loc_name": "USA: Oregon",
"ncbi_taxonomy_name": "soil metagenome",
"sample_collection_site": "Bulk Soil"
},
"study_id": "gold:Gs0154044",
"depth": null,
"env_broad_scale_id": "ENVO:00000446",
"env_local_scale_id": "ENVO:00000011",
"env_medium_id": "ENVO:00005802",
"longitude": -123.2701,
"latitude": 46.1216,
"add_date": "2021-04-30T00:00:00",
"mod_date": "2021-06-16T00:00:00",
"collection_date": "2020-09-09T00:00:00",
"ecosystem": "Environmental",
"ecosystem_category": "Terrestrial",
"ecosystem_type": "Soil",
"ecosystem_subtype": "Botanical garden",
"specific_ecosystem": "Bulk soil",
"open_in_gold": "https://gold.jgi.doe.gov/biosample?id=Gb0291625",
"env_broad_scale": {
"id": "ENVO:00000446",
"label": "terrestrial biome",
"url": "http://purl.obolibrary.org/obo/ENVO:00000446",
"data": {}
},
"env_local_scale": {
"id": "ENVO:00000011",
"label": "garden",
"url": "http://purl.obolibrary.org/obo/ENVO:00000011",
"data": {}
},
"env_medium": {
"id": "ENVO:00005802",
"label": "bulk soil",
"url": "http://purl.obolibrary.org/obo/ENVO:00005802",
"data": {}
},
"env_broad_scale_terms": [
"terrestrial biome",
"biome",
"ecosystem",
"environmental system",
"system",
"material entity",
"independent continuant",
"continuant",
"entity"
],
"env_local_scale_terms": [
"garden",
"anthropogenic geographic feature",
"geographic feature",
"astronomical body part",
"fiat object",
"material entity",
"independent continuant",
"continuant",
"entity"
],
"env_medium_terms": [
"bulk soil",
"soil",
"environmental material",
"fiat object",
"material entity",
"independent continuant",
"continuant",
"entity"
],
"omics_processing": [
{
"id": "gold:Gp0566675",
"name": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA - BESC-388-CL2_50_4",
"description": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA",
"alternate_identifiers": [],
"annotations": {
"type": "nmdc:OmicsProcessing",
"omics_type": "Metagenome",
"ncbi_project_name": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA - BESC-388-CL2_50_4",
"principal_investigator": "Mitchel J. Doktycz",
"processing_institution": "JGI"
},
"study_id": "gold:Gs0154044",
"biosample_id": "gold:Gb0291625",
"add_date": "2021-12-23T00:00:00",
"mod_date": "2021-12-23T00:00:00",
"open_in_gold": "https://gold.jgi.doe.gov/project?id=Gp0566675",
"omics_data": [
{
"id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
"name": "Read QC Activity for nmdc:mga0xys903",
"type": "nmdc:ReadQCAnalysisActivity",
"git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
"started_at_time": "2022-01-05T01:04:43",
"ended_at_time": "2022-01-05T15:56:24",
"execution_resource": "NERSC-Cori",
"omics_processing_id": "gold:Gp0566675",
"outputs": [
{
"id": "nmdc:8b510e76d421c83bed7845501a8918cf",
"name": "gold:Gp0566675_Filtered Reads",
"description": "Filtered Reads for gold:Gp0566675",
"file_size_bytes": 10610637527,
"md5_checksum": "8b510e76d421c83bed7845501a8918cf",
"url": "/api/data_object/nmdc%3A8b510e76d421c83bed7845501a8918cf/download",
"downloads": 0,
"file_type": "Filtered Sequencing Reads",
"file_type_description": "Reads QC result fastq (clean data)",
"selected": false
},
{
"id": "nmdc:dd7082801273dca2dd1da7a5119deda7",
"name": "gold:Gp0566675_Filtered Stats",
"description": "Filtered Stats for gold:Gp0566675",
"file_size_bytes": 3646,
"md5_checksum": "dd7082801273dca2dd1da7a5119deda7",
"url": "/api/data_object/nmdc%3Add7082801273dca2dd1da7a5119deda7/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
}
]
},
{
"id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
"name": "Assembly Activity for nmdc:mga0xys903",
"type": "nmdc:MetagenomeAssembly",
"git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
"started_at_time": "2022-01-05T01:04:43",
"ended_at_time": "2022-01-05T15:56:24",
"execution_resource": "NERSC-Cori",
"omics_processing_id": "gold:Gp0566675",
"outputs": [
{
"id": "nmdc:ff5ebf0b98afeaf8c5cbde3689457ab1",
"name": "gold:Gp0566675_Assembled contigs fasta",
"description": "Assembled contigs fasta for gold:Gp0566675",
"file_size_bytes": 1498528772,
"md5_checksum": "ff5ebf0b98afeaf8c5cbde3689457ab1",
"url": "/api/data_object/nmdc%3Aff5ebf0b98afeaf8c5cbde3689457ab1/download",
"downloads": 0,
"file_type": "Assembly Contigs",
"file_type_description": "Final assembly contigs fasta",
"selected": false
},
{
"id": "nmdc:6588474c6a672969ea116ad47680e9e6",
"name": "gold:Gp0566675_Assembled scaffold fasta",
"description": "Assembled scaffold fasta for gold:Gp0566675",
"file_size_bytes": 1491586035,
"md5_checksum": "6588474c6a672969ea116ad47680e9e6",
"url": "/api/data_object/nmdc%3A6588474c6a672969ea116ad47680e9e6/download",
"downloads": 0,
"file_type": "Assembly Scaffolds",
"file_type_description": "Final assembly scaffolds fasta",
"selected": false
},
{
"id": "nmdc:d41d8cd98f00b204e9800998ecf8427e",
"name": "gold:Gp0452679_metabat2 bin checkm quality assessment result",
"description": "metabat2 bin checkm quality assessment result for gold:Gp0452679",
"file_size_bytes": 0,
"md5_checksum": "d41d8cd98f00b204e9800998ecf8427e",
"url": "/api/data_object/nmdc%3Ad41d8cd98f00b204e9800998ecf8427e/download",
"downloads": 0,
"file_type": "CheckM Statistics",
"file_type_description": "CheckM statistics report",
"selected": false
},
{
"id": "nmdc:7bc999a16ca8e3bfb555d51199d0da8f",
"name": "gold:Gp0566675_Metagenome Alignment BAM file",
"description": "Metagenome Alignment BAM file for gold:Gp0566675",
"file_size_bytes": 12535917798,
"md5_checksum": "7bc999a16ca8e3bfb555d51199d0da8f",
"url": "/api/data_object/nmdc%3A7bc999a16ca8e3bfb555d51199d0da8f/download",
"downloads": 0,
"file_type": "Assembly Coverage BAM",
"file_type_description": "Sorted bam file of reads mapping back to the final assembly",
"selected": false
}
]
},
{
"id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
"name": "Annotation Activity for nmdc:mga0xys903",
"type": "nmdc:MetagenomeAnnotation",
"git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
"started_at_time": "2022-01-05T01:04:43",
"ended_at_time": "2022-01-05T15:56:24",
"execution_resource": "NERSC-Cori",
"omics_processing_id": "gold:Gp0566675",
"outputs": [
{
"id": "nmdc:f2fe8b903fbc33dc52b7070d5f025c26",
"name": "gold:Gp0566675_Protein FAA",
"description": "Protein FAA for gold:Gp0566675",
"file_size_bytes": 452210263,
"md5_checksum": "f2fe8b903fbc33dc52b7070d5f025c26",
"url": "/api/data_object/nmdc%3Af2fe8b903fbc33dc52b7070d5f025c26/download",
"downloads": 0,
"file_type": "Annotation Amino Acid FASTA",
"file_type_description": "FASTA amino acid file for annotated proteins",
"selected": false
},
{
"id": "nmdc:d36ff177b4125d3f4d85e23485bed0d9",
"name": "gold:Gp0566675_Structural annotation GFF file",
"description": "Structural annotation GFF file for gold:Gp0566675",
"file_size_bytes": 211518868,
"md5_checksum": "d36ff177b4125d3f4d85e23485bed0d9",
"url": "/api/data_object/nmdc%3Ad36ff177b4125d3f4d85e23485bed0d9/download",
"downloads": 0,
"file_type": "Structural Annotation GFF",
"file_type_description": "GFF3 format file with structural annotations",
"selected": false
},
{
"id": "nmdc:2616f54fe6604bd965f5174c174d0c67",
"name": "gold:Gp0566675_Functional annotation GFF file",
"description": "Functional annotation GFF file for gold:Gp0566675",
"file_size_bytes": 387519457,
"md5_checksum": "2616f54fe6604bd965f5174c174d0c67",
"url": "/api/data_object/nmdc%3A2616f54fe6604bd965f5174c174d0c67/download",
"downloads": 0,
"file_type": "Functional Annotation GFF",
"file_type_description": "GFF3 format file with functional annotations",
"selected": false
},
{
"id": "nmdc:419f975debb1e88caddfa1859c99cddb",
"name": "gold:Gp0566675_KO TSV file",
"description": "KO TSV file for gold:Gp0566675",
"file_size_bytes": 44909434,
"md5_checksum": "419f975debb1e88caddfa1859c99cddb",
"url": "/api/data_object/nmdc%3A419f975debb1e88caddfa1859c99cddb/download",
"downloads": 0,
"file_type": "Annotation KEGG Orthology",
"file_type_description": "Tab delimited file for KO annotation",
"selected": false
},
{
"id": "nmdc:c84ba3a661301b45bbb6b56d889a956b",
"name": "gold:Gp0566675_EC TSV file",
"description": "EC TSV file for gold:Gp0566675",
"file_size_bytes": 29285396,
"md5_checksum": "c84ba3a661301b45bbb6b56d889a956b",
"url": "/api/data_object/nmdc%3Ac84ba3a661301b45bbb6b56d889a956b/download",
"downloads": 0,
"file_type": "Annotation Enzyme Commission",
"file_type_description": "Tab delimited file for EC annotation",
"selected": false
},
{
"id": "nmdc:089df71a6e2994db942945a05122bbfc",
"name": "gold:Gp0566675_COG GFF file",
"description": "COG GFF file for gold:Gp0566675",
"file_size_bytes": 243168059,
"md5_checksum": "089df71a6e2994db942945a05122bbfc",
"url": "/api/data_object/nmdc%3A089df71a6e2994db942945a05122bbfc/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:99ef3156c471c206d762eb6264f8354a",
"name": "gold:Gp0566675_PFAM GFF file",
"description": "PFAM GFF file for gold:Gp0566675",
"file_size_bytes": 234457003,
"md5_checksum": "99ef3156c471c206d762eb6264f8354a",
"url": "/api/data_object/nmdc%3A99ef3156c471c206d762eb6264f8354a/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:dffd918b20d93009a036cf263dc10f97",
"name": "gold:Gp0566675_TigrFam GFF file",
"description": "TigrFam GFF file for gold:Gp0566675",
"file_size_bytes": 33550202,
"md5_checksum": "dffd918b20d93009a036cf263dc10f97",
"url": "/api/data_object/nmdc%3Adffd918b20d93009a036cf263dc10f97/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:bcc0b66e2d64c26413aa09af4f92dee2",
"name": "gold:Gp0566675_SMART GFF file",
"description": "SMART GFF file for gold:Gp0566675",
"file_size_bytes": 65877391,
"md5_checksum": "bcc0b66e2d64c26413aa09af4f92dee2",
"url": "/api/data_object/nmdc%3Abcc0b66e2d64c26413aa09af4f92dee2/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:9b4bfcafe1a92151f2c9bb4fb626aef9",
"name": "gold:Gp0566675_SuperFam GFF file",
"description": "SuperFam GFF file for gold:Gp0566675",
"file_size_bytes": 296375492,
"md5_checksum": "9b4bfcafe1a92151f2c9bb4fb626aef9",
"url": "/api/data_object/nmdc%3A9b4bfcafe1a92151f2c9bb4fb626aef9/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:43922b8d0f06dc9477e66537ba176dd3",
"name": "gold:Gp0566675_Cath FunFam GFF file",
"description": "Cath FunFam GFF file for gold:Gp0566675",
"file_size_bytes": 262861674,
"md5_checksum": "43922b8d0f06dc9477e66537ba176dd3",
"url": "/api/data_object/nmdc%3A43922b8d0f06dc9477e66537ba176dd3/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:db4d0e939563ce9a94ecd47bc3ef402c",
"name": "gold:Gp0566675_CRT GFF file",
"description": "CRT GFF file for gold:Gp0566675",
"file_size_bytes": 89820,
"md5_checksum": "db4d0e939563ce9a94ecd47bc3ef402c",
"url": "/api/data_object/nmdc%3Adb4d0e939563ce9a94ecd47bc3ef402c/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:e84c2d487b63505758057f580e03447c",
"name": "gold:Gp0566675_Genemark GFF file",
"description": "Genemark GFF file for gold:Gp0566675",
"file_size_bytes": 280641652,
"md5_checksum": "e84c2d487b63505758057f580e03447c",
"url": "/api/data_object/nmdc%3Ae84c2d487b63505758057f580e03447c/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:5373f555206a0f45624aff162fd2dce7",
"name": "gold:Gp0566675_Prodigal GFF file",
"description": "Prodigal GFF file for gold:Gp0566675",
"file_size_bytes": 357945627,
"md5_checksum": "5373f555206a0f45624aff162fd2dce7",
"url": "/api/data_object/nmdc%3A5373f555206a0f45624aff162fd2dce7/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:520f5fe9d4f56c04e083388698524763",
"name": "gold:Gp0566675_tRNA GFF File",
"description": "tRNA GFF File for gold:Gp0566675",
"file_size_bytes": 1261571,
"md5_checksum": "520f5fe9d4f56c04e083388698524763",
"url": "/api/data_object/nmdc%3A520f5fe9d4f56c04e083388698524763/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:55edeecb043e9de819eb103d23764239",
"name": "gold:Gp0566675_RFAM misc binding GFF file",
"description": "RFAM misc binding GFF file for gold:Gp0566675",
"file_size_bytes": 633637,
"md5_checksum": "55edeecb043e9de819eb103d23764239",
"url": "/api/data_object/nmdc%3A55edeecb043e9de819eb103d23764239/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:5525ecfe503f367e48abe898021630af",
"name": "gold:Gp0566675_RFAM rRNA GFF file",
"description": "RFAM rRNA GFF file for gold:Gp0566675",
"file_size_bytes": 217284,
"md5_checksum": "5525ecfe503f367e48abe898021630af",
"url": "/api/data_object/nmdc%3A5525ecfe503f367e48abe898021630af/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:736b759bcf01ff7b6393351207bcadc2",
"name": "gold:Gp0566675_RFAM rmRNA GFF file",
"description": "RFAM rmRNA GFF file for gold:Gp0566675",
"file_size_bytes": 120987,
"md5_checksum": "736b759bcf01ff7b6393351207bcadc2",
"url": "/api/data_object/nmdc%3A736b759bcf01ff7b6393351207bcadc2/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:99b729018e0a2b9a70dd9bc16d419124",
"name": "gold:Gp0566675_KO_EC GFF file",
"description": "KO_EC GFF file for gold:Gp0566675",
"file_size_bytes": 144776412,
"md5_checksum": "99b729018e0a2b9a70dd9bc16d419124",
"url": "/api/data_object/nmdc%3A99b729018e0a2b9a70dd9bc16d419124/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
}
]
},
{
"id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
"name": "MAGs Analysis Activity for nmdc:mga0xys903",
"type": "nmdc:MAGsAnalysisActivity",
"git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
"started_at_time": "2022-01-05T01:04:43",
"ended_at_time": "2022-01-05T15:56:24",
"execution_resource": "NERSC-Cori",
"omics_processing_id": "gold:Gp0566675",
"outputs": [
{
"id": "nmdc:b0c0e8a3fb844d94a8028d1cf1391ddf",
"name": "gold:Gp0566675_metabat2 bin checkm quality assessment result",
"description": "metabat2 bin checkm quality assessment result for gold:Gp0566675",
"file_size_bytes": 765,
"md5_checksum": "b0c0e8a3fb844d94a8028d1cf1391ddf",
"url": "/api/data_object/nmdc%3Ab0c0e8a3fb844d94a8028d1cf1391ddf/download",
"downloads": 0,
"file_type": "CheckM Statistics",
"file_type_description": "CheckM statistics report",
"selected": false
},
{
"id": "nmdc:a3f49b7c64ce6956f58de8b716e6851c",
"name": "gold:Gp0566675_high-quality and medium-quality bins",
"description": "high-quality and medium-quality bins for gold:Gp0566675",
"file_size_bytes": 283643734,
"md5_checksum": "a3f49b7c64ce6956f58de8b716e6851c",
"url": "/api/data_object/nmdc%3Aa3f49b7c64ce6956f58de8b716e6851c/download",
"downloads": 0,
"file_type": "Metagenome Bins",
"file_type_description": "Metagenome bin contigs fasta",
"selected": false
}
]
},
{
"id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
"name": "ReadBased Analysis Activity for nmdc:mga0xys903",
"type": "nmdc:ReadbasedAnalysis",
"git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
"started_at_time": "2022-01-05T01:04:43",
"ended_at_time": "2022-01-05T15:56:24",
"execution_resource": "NERSC-Cori",
"omics_processing_id": "gold:Gp0566675",
"outputs": [
{
"id": "nmdc:917cb0bccb50b78d08bb40b16ce041b2",
"name": "gold:Gp0566675_Gottcha2 TSV report",
"description": "Gottcha2 TSV report for gold:Gp0566675",
"file_size_bytes": 13642,
"md5_checksum": "917cb0bccb50b78d08bb40b16ce041b2",
"url": "/api/data_object/nmdc%3A917cb0bccb50b78d08bb40b16ce041b2/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:a28043bbb50d143658fabf376069252e",
"name": "gold:Gp0566675_Gottcha2 full TSV report",
"description": "Gottcha2 full TSV report for gold:Gp0566675",
"file_size_bytes": 1379048,
"md5_checksum": "a28043bbb50d143658fabf376069252e",
"url": "/api/data_object/nmdc%3Aa28043bbb50d143658fabf376069252e/download",
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
},
{
"id": "nmdc:bbde2a548568774d69fb81de7b3cd8a0",
"name": "gold:Gp0566675_Gottcha2 Krona HTML report",
"description": "Gottcha2 Krona HTML report for gold:Gp0566675",
"file_size_bytes": 269443,
"md5_checksum": "bbde2a548568774d69fb81de7b3cd8a0",
"url": "/api/data_object/nmdc%3Abbde2a548568774d69fb81de7b3cd8a0/download",
"downloads": 0,
"file_type": "GOTTCHA2 Krona Plot",
"file_type_description": "GOTTCHA2 krona plot HTML file",
"selected": false
},
{
"id": "nmdc:557632d5f7308b8d24be454bb5768efe",
"name": "gold:Gp0566675_Centrifuge classification TSV report",
"description": "Centrifuge classification TSV report for gold:Gp0566675",
"file_size_bytes": 12688495975,
"md5_checksum": "557632d5f7308b8d24be454bb5768efe",
"url": "/api/data_object/nmdc%3A557632d5f7308b8d24be454bb5768efe/download",
"downloads": 0,
"file_type": "Centrifuge Taxonomic Classification",
"file_type_description": "Centrifuge output read classification file",
"selected": false
},
{
"id": "nmdc:df4dc44574d6955359ecd2a3e7840a89",
"name": "gold:Gp0566675_Centrifuge TSV report",
"description": "Centrifuge TSV report for gold:Gp0566675",
"file_size_bytes": 269384,
"md5_checksum": "df4dc44574d6955359ecd2a3e7840a89",
"url": "/api/data_object/nmdc%3Adf4dc44574d6955359ecd2a3e7840a89/download",
"downloads": 0,
"file_type": "Centrifuge Classification Report",
"file_type_description": "Centrifuge output report file",
"selected": false
},
{
"id": "nmdc:5d10a51da4a121bd6784430a5380e64e",
"name": "gold:Gp0566675_Centrifuge Krona HTML report",
"description": "Centrifuge Krona HTML report for gold:Gp0566675",
"file_size_bytes": 2363784,
"md5_checksum": "5d10a51da4a121bd6784430a5380e64e",
"url": "/api/data_object/nmdc%3A5d10a51da4a121bd6784430a5380e64e/download",
"downloads": 0,
"file_type": "Centrifuge Krona Plot",
"file_type_description": "Centrifug krona plot HTML file",
"selected": false
},
{
"id": "nmdc:b88abb551eb44ff80543cc0c746bcd1c",
"name": "gold:Gp0566675_Kraken classification TSV report",
"description": "Kraken classification TSV report for gold:Gp0566675",
"file_size_bytes": 10244299516,
"md5_checksum": "b88abb551eb44ff80543cc0c746bcd1c",
"url": "/api/data_object/nmdc%3Ab88abb551eb44ff80543cc0c746bcd1c/download",
"downloads": 0,
"file_type": "Kraken2 Taxonomic Classification",
"file_type_description": "Kraken2 output read classification file",
"selected": false
},
{
"id": "nmdc:e18e7a581221086ea90e6591ac260dbc",
"name": "gold:Gp0566675_Kraken2 TSV report",
"description": "Kraken2 TSV report for gold:Gp0566675",
"file_size_bytes": 629786,
"md5_checksum": "e18e7a581221086ea90e6591ac260dbc",
"url": "/api/data_object/nmdc%3Ae18e7a581221086ea90e6591ac260dbc/download",
"downloads": 0,
"file_type": "Kraken2 Classification Report",
"file_type_description": "Kraken2 output report file",
"selected": false
},
{
"id": "nmdc:efea0f4cc8ce0313674031eb244db10b",
"name": "gold:Gp0566675_Kraken2 Krona HTML report",
"description": "Kraken2 Krona HTML report for gold:Gp0566675",
"file_size_bytes": 3946508,
"md5_checksum": "efea0f4cc8ce0313674031eb244db10b",
"url": "/api/data_object/nmdc%3Aefea0f4cc8ce0313674031eb244db10b/download",
"downloads": 0,
"file_type": "Kraken2 Krona Plot",
"file_type_description": "Kraken2 krona plot HTML file",
"selected": false
}
]
}
],
"outputs": [
{
"id": "nmdc:76f897f36baa40832bf2ed42eb31b947",
"name": "52550.4.380800.TGATGTCC-TGATGTCC.fastq.gz",
"description": "Raw sequencer read data",
"file_size_bytes": 11740110153,
"md5_checksum": null,
"url": null,
"downloads": 0,
"file_type": null,
"file_type_description": null,
"selected": false
}
]
}
],
"multiomics": 8
}
// ...
]
}
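A biosample search response nests data objects several levels deep (results, then omics_processing, then omics_data, then outputs). The sketch below walks that nesting to collect download URLs; `iter_download_urls` is a hypothetical helper, the base URL is an assumption you should set to the host you queried, and the sample is a trimmed copy of the response above.

```python
def iter_download_urls(response, file_type=None,
                       base_url="https://data.microbiomedata.org"):
    """Yield (file_type, absolute URL) for each downloadable output.

    Hypothetical helper. It walks
    results -> omics_processing -> omics_data -> outputs.
    """
    for biosample in response.get("results", []):
        for processing in biosample.get("omics_processing", []):
            for activity in processing.get("omics_data", []):
                for output in activity.get("outputs", []):
                    url = output.get("url")
                    if url is None:
                        continue  # "url" can be null, so skip such entries
                    if file_type is not None and output.get("file_type") != file_type:
                        continue
                    yield output.get("file_type"), base_url + url


# Trimmed copy of the response above, keeping only the fields used here.
sample = {
    "count": 113,
    "results": [{
        "id": "gold:Gb0291625",
        "omics_processing": [{
            "id": "gold:Gp0566675",
            "omics_data": [{
                "type": "nmdc:MetagenomeAssembly",
                "outputs": [
                    {"file_type": "Assembly Contigs",
                     "url": "/api/data_object/nmdc%3Aff5ebf0b98afeaf8c5cbde3689457ab1/download"},
                    {"file_type": "Assembly Scaffolds",
                     "url": "/api/data_object/nmdc%3A6588474c6a672969ea116ad47680e9e6/download"},
                ],
            }],
        }],
    }],
}

contigs = list(iter_download_urls(sample, file_type="Assembly Contigs"))
for kind, url in contigs:
    print(kind, url)
```

Many outputs in the response have a null file_type, so filtering by file_type only finds files the portal has classified.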
Find the count of biosamples for each geographic location.
POST
https://data.dev.microbiomedata.org/api/biosample/facet
Payload
{
  "conditions": [
    {
      "op": "==",
      "field": "principal_investigator_name",
      "value": "Mitchel J. Doktycz",
      "table": "study"
    },
    {
      "op": "==",
      "field": "env_broad_scale",
      "value": "terrestrial biome",
      "table": "biosample"
    },
    {
      "op": "==",
      "field": "env_medium",
      "value": "bulk soil",
      "table": "biosample"
    },
    {
      "op": "==",
      "field": "omics_type",
      "value": "Metagenome",
      "table": "omics_processing"
    },
    {
      "op": "==",
      "field": "processing_institution",
      "value": "JGI",
      "table": "omics_processing"
    }
  ],
  "attribute": "geo_loc_name"
}
Response
{
  "facets": {
    "USA: Oregon": 103,
    "USA: Tennessee": 10
  }
}
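The facet response is a plain mapping from attribute value to count, so it is easy to rank in Python. A minimal sketch using the response above; `rank_facets` is a hypothetical helper.

```python
def rank_facets(response):
    """Return (value, count) pairs sorted by descending count."""
    return sorted(response["facets"].items(), key=lambda pair: pair[1], reverse=True)


facet_response = {"facets": {"USA: Oregon": 103, "USA: Tennessee": 10}}

ranked = rank_facets(facet_response)
total = sum(count for _, count in ranked)
for value, count in ranked:
    print(f"{value}: {count} of {total}")
```

Here the facet counts add up to 113, which matches the count field of the search response for the same conditions.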
Find the counts of biosamples collected in each month.
POST
https://data.dev.microbiomedata.org/api/biosample/binned_facet
Payload
{
  "attribute": "collection_date",
  "conditions": [
    {
      "op": "==",
      "field": "principal_investigator_name",
      "value": "Mitchel J. Doktycz",
      "table": "study"
    },
    {
      "op": "==",
      "field": "env_broad_scale",
      "value": "terrestrial biome",
      "table": "biosample"
    },
    {
      "op": "==",
      "field": "env_medium",
      "value": "bulk soil",
      "table": "biosample"
    },
    {
      "op": "==",
      "field": "omics_type",
      "value": "Metagenome",
      "table": "omics_processing"
    },
    {
      "op": "==",
      "field": "processing_institution",
      "value": "JGI",
      "table": "omics_processing"
    }
  ],
  "resolution": "month"
}
Response
{
"facets": [
10,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
103
],
"bins": [
"2014-08-01T00:00:00",
"2014-09-01T00:00:00",
"2014-10-01T00:00:00",
"2014-11-01T00:00:00",
"2014-12-01T00:00:00",
"2015-01-01T00:00:00",
"2015-02-01T00:00:00",
"2015-03-01T00:00:00",
"2015-04-01T00:00:00",
"2015-05-01T00:00:00",
"2015-06-01T00:00:00",
"2015-07-01T00:00:00",
"2015-08-01T00:00:00",
"2015-09-01T00:00:00",
"2015-10-01T00:00:00",
"2015-11-01T00:00:00",
"2015-12-01T00:00:00",
"2016-01-01T00:00:00",
"2016-02-01T00:00:00",
"2016-03-01T00:00:00",
"2016-04-01T00:00:00",
"2016-05-01T00:00:00",
"2016-06-01T00:00:00",
"2016-07-01T00:00:00",
"2016-08-01T00:00:00",
"2016-09-01T00:00:00",
"2016-10-01T00:00:00",
"2016-11-01T00:00:00",
"2016-12-01T00:00:00",
"2017-01-01T00:00:00",
"2017-02-01T00:00:00",
"2017-03-01T00:00:00",
"2017-04-01T00:00:00",
"2017-05-01T00:00:00",
"2017-06-01T00:00:00",
"2017-07-01T00:00:00",
"2017-08-01T00:00:00",
"2017-09-01T00:00:00",
"2017-10-01T00:00:00",
"2017-11-01T00:00:00",
"2017-12-01T00:00:00",
"2018-01-01T00:00:00",
"2018-02-01T00:00:00",
"2018-03-01T00:00:00",
"2018-04-01T00:00:00",
"2018-05-01T00:00:00",
"2018-06-01T00:00:00",
"2018-07-01T00:00:00",
"2018-08-01T00:00:00",
"2018-09-01T00:00:00",
"2018-10-01T00:00:00",
"2018-11-01T00:00:00",
"2018-12-01T00:00:00",
"2019-01-01T00:00:00",
"2019-02-01T00:00:00",
"2019-03-01T00:00:00",
"2019-04-01T00:00:00",
"2019-05-01T00:00:00",
"2019-06-01T00:00:00",
"2019-07-01T00:00:00",
"2019-08-01T00:00:00",
"2019-09-01T00:00:00",
"2019-10-01T00:00:00",
"2019-11-01T00:00:00",
"2019-12-01T00:00:00",
"2020-01-01T00:00:00",
"2020-02-01T00:00:00",
"2020-03-01T00:00:00",
"2020-04-01T00:00:00",
"2020-05-01T00:00:00",
"2020-06-01T00:00:00",
"2020-07-01T00:00:00",
"2020-08-01T00:00:00",
"2020-09-01T00:00:00",
"2020-10-01T00:00:00"
]
}
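In the response above, "bins" has one more entry than "facets" (75 versus 74), which suggests that the bins are edges and each count covers the interval between two consecutive edges. Under that assumption, the sketch below pairs them up and drops the empty months; `nonempty_bins` is a hypothetical helper, and the sample is the first two counts and first three bin edges of the response above.

```python
def nonempty_bins(response):
    """Return (start, end, count) for every bin with at least one sample.

    Hypothetical helper. Assumes "bins" holds edges, so it must contain
    exactly one more entry than "facets".
    """
    bins = response["bins"]
    counts = response["facets"]
    if len(bins) != len(counts) + 1:
        raise ValueError("expected one more bin edge than counts")
    return [
        (start, end, count)
        for start, end, count in zip(bins[:-1], bins[1:], counts)
        if count
    ]


# First two counts and first three bin edges of the response above.
sample = {
    "facets": [10, 0],
    "bins": [
        "2014-08-01T00:00:00",
        "2014-09-01T00:00:00",
        "2014-10-01T00:00:00",
    ],
}

for start, end, count in nonempty_bins(sample):
    print(f"{start} to {end}: {count}")
```

Applied to the full response, this would leave two intervals: the one starting 2014-08-01 with 10 biosamples and the one starting 2020-09-01 with 103.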