Search API Docs

Matt Covalt edited this page Dec 8, 2022 · 14 revisions

Some resources:

Authenticated endpoints

Some API endpoints require authentication. You are authenticated automatically if you are signed in with your ORCID account on the data portal website. As long as you have an active session in the portal, you can use the Swagger API page.

Programmatic Authentication

This API relies on cookie auth.

  1. Find the session cookie for your current browser login. In Chrome, open chrome://settings/siteData?searchSubpage=microbiomedata&search=cookies
  2. Pass that cookie to curl with the --cookie option. Your request will look like this: curl -X 'GET' 'https://data.dev.microbiomedata.org/api/metadata_submission?offset=0&limit=25' -H 'accept: application/json' --cookie "session=PASTE_COOKIE_HERE"
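
The same request can be prepared from Python. The following is a minimal sketch using only the standard library; the helper name `build_authenticated_request` is hypothetical, and the base URL and the `session` cookie name are taken from the curl example above. The request is only built here, not sent.

```python
import urllib.request

# Base URL copied from the curl example above.
BASE_URL = "https://data.dev.microbiomedata.org"


def build_authenticated_request(path, session_cookie):
    """Return a GET request that carries the portal session cookie.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    request = urllib.request.Request(f"{BASE_URL}{path}", method="GET")
    request.add_header("accept", "application/json")
    # Equivalent to curl's --cookie "session=..."
    request.add_header("Cookie", f"session={session_cookie}")
    return request


request = build_authenticated_request(
    "/api/metadata_submission?offset=0&limit=25", "PASTE_COOKIE_HERE"
)
# To send it: urllib.request.urlopen(request).read()
```

Treat the cookie value like a password: anyone who has it can act as your portal session until it expires.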

Direct ID lookup

Search

A search is performed by sending a POST request with a JSON body that describes the query.

Example 1:

{"conditions":[{"value":"gold:Gs0114675","table":"study","op":"==","field":"study_id"}],"data_object_filter":[]}

Example 2:

{"conditions":[{"op":"==","field":"omics_type","value":"Organic Matter Characterization","table":"omics_processing"}],"data_object_filter":[]}
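
Bodies like the two examples above can be assembled programmatically. This is a small sketch; the helper names `condition` and `search_body` are hypothetical, and the only keys used are the ones that appear in the examples (`op`, `field`, `value`, `table`, `conditions`, `data_object_filter`).

```python
import json


def condition(table, field, value, op="=="):
    """Build one entry of the "conditions" list."""
    return {"op": op, "field": field, "value": value, "table": table}


def search_body(*conditions):
    """Wrap conditions in the body shape used by the examples above."""
    return {"conditions": list(conditions), "data_object_filter": []}


# Reproduces Example 1 (key order differs, which JSON does not care about).
body = search_body(condition("study", "study_id", "gold:Gs0114675"))
print(json.dumps(body))
```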

Response payload structure

The response JSON structure for most endpoints can be inferred from looking at the TypeScript interfaces.

For example, study search returns a SearchResponse<StudySearchResults>, which can be interpreted as a SearchResponse where the generic result slot is typed as StudySearchResults.
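
A rough Python analogue of that generic type is sketched below. It assumes only the two top-level keys visible in the truncated response on this page, `count` and `results`; the TypeScript interfaces remain the authoritative definition and may declare more fields. The names `parse_search_response` and `parse_item` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class SearchResponse(Generic[T]):
    """Assumed shape: a match count plus a list of typed results."""
    count: int
    results: List[T]


def parse_search_response(
    payload: Dict[str, Any],
    parse_item: Callable[[Dict[str, Any]], T],
) -> SearchResponse[T]:
    """Convert a decoded JSON payload, parsing each result with parse_item."""
    return SearchResponse(
        count=payload["count"],
        results=[parse_item(item) for item in payload["results"]],
    )


# Illustrative payload, trimmed to a single field per result.
payload = {"count": 113, "results": [{"id": "gold:Gb0291625"}]}
parsed = parse_search_response(payload, lambda item: item["id"])
```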

Python

The following code snippet can help construct and run queries from Python.

import requests
import json

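class Query(object):
    offset = 0
    limit = 15
    item = None

    def __init__(self):
        # Per-instance state: a class-level list would be shared by every query.
        self.filters = []
        self.results = None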

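    def __getitem__(self, key):
        if isinstance(key, slice):
            # A missing start means "from the first result".
            start = 0 if key.start is None else key.start
            self.offset = start
            if key.stop is None:
                self.limit = 100 - start
            else:
                self.limit = key.stop - start
        elif isinstance(key, int):
            self.offset = key
            self.limit = 1
        else:
            raise TypeError("Query indices must be integers or slices")
        # Drop cached results so the new window is fetched on iteration.
        self.results = None
        return self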
    
    def __iter__(self):
        if self.results is None:
            response = self._request()
            # Fail loudly on HTTP errors instead of iterating over None.
            response.raise_for_status()
            self.results = response.json()["results"]
        for result in self.results:
            yield result

    def filter(self, **kwargs):
        if kwargs.get("item") is not None:
            table = kwargs["item"]
        else:
            table = self.item
        for arg in kwargs:
            if arg == "item":
                continue
            self.filters.append(dict(op="==", field=arg, value=kwargs[arg], table=table))
        return self

    def _request(self):
        url = f"https://data.microbiomedata.org/api/{self.item}/search?offset={self.offset}&limit={self.limit}"
        # json= serializes the body and sets the Content-Type header.
        return requests.post(url, json=dict(conditions=self.filters))

class BiosampleQuery(Query):
    item = "biosample"

class StudyQuery(Query):
    item = "study"

class OmicsProcessingQuery(Query):
    item = "omics_processing"

With that code loaded you can perform queries such as:

query = (BiosampleQuery()
    .filter(ecosystem_type="Soil")
    .filter(item="gene_function", id="KEGG.ORTHOLOGY:K00003"))

for item in query[0:5]:
    print(item["name"])

Filters assume that fields belong to the current query type, which in this example allows searching by ecosystem_type without specifying the item. Gene functions, on the other hand, must be joined in, so item="gene_function" is needed to indicate that the id field applies to a gene function.

The slicing operation [0:5] on the query above sets an offset and limit on the search to retrieve partial results.
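
To walk through every match rather than one window, the offset can be advanced page by page. The sketch below assumes that the `count` field of a response is the total number of matches (the truncated response on this page shows `"count": 113`, but this page does not state its meaning). `iter_all_results` and `fake_fetch_page` are hypothetical names; the fake stands in for a real POST to a search endpoint.

```python
def iter_all_results(fetch_page, page_size=25):
    """Yield every result by requesting consecutive pages.

    fetch_page(offset, limit) must return a dict with "count" and "results".
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        results = page["results"]
        yield from results
        offset += len(results)
        # Stop on an empty page or once the assumed total has been reached.
        if not results or offset >= page["count"]:
            break


def fake_fetch_page(offset, limit):
    """Stand-in for a real request, serving seven made-up records."""
    data = [{"name": f"sample-{i}"} for i in range(7)]
    return {"count": len(data), "results": data[offset:offset + limit]}


names = [item["name"] for item in iter_all_results(fake_fetch_page, page_size=3)]
```

With a real endpoint, `fetch_page` would post the same conditions each time and change only the `offset` and `limit` query parameters.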

This example prints the download URLs of assembly contig fasta files of soil samples:

query = OmicsProcessingQuery().filter(omics_type="Metagenome").filter(item="biosample", ecosystem_type="Soil")

for item in query[0:20]:
    for omics in item["omics_data"]:
        for output in omics["outputs"]:
            if output["file_type"] == "Assembly Contigs":
                print(f"https://data.microbiomedata.org{output['url']}")
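
Each output record in the responses shown on this page also carries an `md5_checksum` field. Assuming that field is the MD5 digest of the downloadable file (this page does not say so explicitly), a downloaded file could be checked with a sketch like the following; `md5_matches` is a hypothetical helper.

```python
import hashlib
import os
import tempfile


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Return True if the file at path has the expected MD5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()


# Demo with an empty temporary file, whose MD5 is the well-known
# d41d8cd98f00b204e9800998ecf8427e (the value a zero-byte record would carry).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    empty_path = tmp.name
is_valid = md5_matches(empty_path, "d41d8cd98f00b204e9800998ecf8427e")
os.remove(empty_path)
```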

This example finds studies with James Stegen as PI:

query = StudyQuery().filter(principal_investigator_name="James Stegen")

for item in query:
    print(item["name"])

Full Example Searching for Biosamples

This example searches for biosamples that match all of these criteria:

  • PI name ("Mitchel J. Doktycz")
  • Broad-scale environmental context ("terrestrial biome" [ENVO:00000446])
  • Environmental medium ("bulk soil" [ENVO:00005802])
  • Omics type ("Metagenome")
  • Processing institution ("JGI")

Search Page

Search

Find all biosamples matching these criteria

POST https://data.dev.microbiomedata.org/api/biosample/search

Payload
{
    "conditions": [
        {
            "op": "==",
            "field": "principal_investigator_name",
            "value": "Mitchel J. Doktycz",
            "table": "study"
        },
        {
            "op": "==",
            "field": "env_broad_scale",
            "value": "terrestrial biome",
            "table": "biosample"
        },
        {
            "op": "==",
            "field": "env_medium",
            "value": "bulk soil",
            "table": "biosample"
        },
        {
            "op": "==",
            "field": "omics_type",
            "value": "Metagenome",
            "table": "omics_processing"
        },
        {
            "op": "==",
            "field": "processing_institution",
            "value": "JGI",
            "table": "omics_processing"
        }
    ],
    "data_object_filter": []
}
Response (truncated)
{
    "count": 113,
    "results": [
        {
            "id": "gold:Gb0291625",
            "name": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA - BESC-388-CL2_50_4",
            "description": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA",
            "alternate_identifiers": [
                "img.taxon:3300046812",
                "gold:Gb0291625"
            ],
            "annotations": {
                "type": "nmdc:Biosample",
                "habitat": "Soil",
                "lat_lon": "46.1216 -123.2701",
                "location": "poplar common garden site in Clatskanie, Oregon, USA",
                "community": "microbial communities",
                "identifier": "BESC-388-CL2_50_4",
                "geo_loc_name": "USA: Oregon",
                "ncbi_taxonomy_name": "soil metagenome",
                "sample_collection_site": "Bulk Soil"
            },
            "study_id": "gold:Gs0154044",
            "depth": null,
            "env_broad_scale_id": "ENVO:00000446",
            "env_local_scale_id": "ENVO:00000011",
            "env_medium_id": "ENVO:00005802",
            "longitude": -123.2701,
            "latitude": 46.1216,
            "add_date": "2021-04-30T00:00:00",
            "mod_date": "2021-06-16T00:00:00",
            "collection_date": "2020-09-09T00:00:00",
            "ecosystem": "Environmental",
            "ecosystem_category": "Terrestrial",
            "ecosystem_type": "Soil",
            "ecosystem_subtype": "Botanical garden",
            "specific_ecosystem": "Bulk soil",
            "open_in_gold": "https://gold.jgi.doe.gov/biosample?id=Gb0291625",
            "env_broad_scale": {
                "id": "ENVO:00000446",
                "label": "terrestrial biome",
                "url": "http://purl.obolibrary.org/obo/ENVO:00000446",
                "data": {}
            },
            "env_local_scale": {
                "id": "ENVO:00000011",
                "label": "garden",
                "url": "http://purl.obolibrary.org/obo/ENVO:00000011",
                "data": {}
            },
            "env_medium": {
                "id": "ENVO:00005802",
                "label": "bulk soil",
                "url": "http://purl.obolibrary.org/obo/ENVO:00005802",
                "data": {}
            },
            "env_broad_scale_terms": [
                "terrestrial biome",
                "biome",
                "ecosystem",
                "environmental system",
                "system",
                "material entity",
                "independent continuant",
                "continuant",
                "entity"
            ],
            "env_local_scale_terms": [
                "garden",
                "anthropogenic geographic feature",
                "geographic feature",
                "astronomical body part",
                "fiat object",
                "material entity",
                "independent continuant",
                "continuant",
                "entity"
            ],
            "env_medium_terms": [
                "bulk soil",
                "soil",
                "environmental material",
                "fiat object",
                "material entity",
                "independent continuant",
                "continuant",
                "entity"
            ],
            "omics_processing": [
                {
                    "id": "gold:Gp0566675",
                    "name": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA - BESC-388-CL2_50_4",
                    "description": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA",
                    "alternate_identifiers": [],
                    "annotations": {
                        "type": "nmdc:OmicsProcessing",
                        "omics_type": "Metagenome",
                        "ncbi_project_name": "Bulk soil microbial communities from poplar common garden site in Clatskanie, Oregon, USA - BESC-388-CL2_50_4",
                        "principal_investigator": "Mitchel J. Doktycz",
                        "processing_institution": "JGI"
                    },
                    "study_id": "gold:Gs0154044",
                    "biosample_id": "gold:Gb0291625",
                    "add_date": "2021-12-23T00:00:00",
                    "mod_date": "2021-12-23T00:00:00",
                    "open_in_gold": "https://gold.jgi.doe.gov/project?id=Gp0566675",
                    "omics_data": [
                        {
                            "id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
                            "name": "Read QC Activity for nmdc:mga0xys903",
                            "type": "nmdc:ReadQCAnalysisActivity",
                            "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
                            "started_at_time": "2022-01-05T01:04:43",
                            "ended_at_time": "2022-01-05T15:56:24",
                            "execution_resource": "NERSC-Cori",
                            "omics_processing_id": "gold:Gp0566675",
                            "outputs": [
                                {
                                    "id": "nmdc:8b510e76d421c83bed7845501a8918cf",
                                    "name": "gold:Gp0566675_Filtered Reads",
                                    "description": "Filtered Reads for gold:Gp0566675",
                                    "file_size_bytes": 10610637527,
                                    "md5_checksum": "8b510e76d421c83bed7845501a8918cf",
                                    "url": "/api/data_object/nmdc%3A8b510e76d421c83bed7845501a8918cf/download",
                                    "downloads": 0,
                                    "file_type": "Filtered Sequencing Reads",
                                    "file_type_description": "Reads QC result fastq (clean data)",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:dd7082801273dca2dd1da7a5119deda7",
                                    "name": "gold:Gp0566675_Filtered Stats",
                                    "description": "Filtered Stats for gold:Gp0566675",
                                    "file_size_bytes": 3646,
                                    "md5_checksum": "dd7082801273dca2dd1da7a5119deda7",
                                    "url": "/api/data_object/nmdc%3Add7082801273dca2dd1da7a5119deda7/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                }
                            ]
                        },
                        {
                            "id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
                            "name": "Assembly Activity for nmdc:mga0xys903",
                            "type": "nmdc:MetagenomeAssembly",
                            "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
                            "started_at_time": "2022-01-05T01:04:43",
                            "ended_at_time": "2022-01-05T15:56:24",
                            "execution_resource": "NERSC-Cori",
                            "omics_processing_id": "gold:Gp0566675",
                            "outputs": [
                                {
                                    "id": "nmdc:ff5ebf0b98afeaf8c5cbde3689457ab1",
                                    "name": "gold:Gp0566675_Assembled contigs fasta",
                                    "description": "Assembled contigs fasta for gold:Gp0566675",
                                    "file_size_bytes": 1498528772,
                                    "md5_checksum": "ff5ebf0b98afeaf8c5cbde3689457ab1",
                                    "url": "/api/data_object/nmdc%3Aff5ebf0b98afeaf8c5cbde3689457ab1/download",
                                    "downloads": 0,
                                    "file_type": "Assembly Contigs",
                                    "file_type_description": "Final assembly contigs fasta",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:6588474c6a672969ea116ad47680e9e6",
                                    "name": "gold:Gp0566675_Assembled scaffold fasta",
                                    "description": "Assembled scaffold fasta for gold:Gp0566675",
                                    "file_size_bytes": 1491586035,
                                    "md5_checksum": "6588474c6a672969ea116ad47680e9e6",
                                    "url": "/api/data_object/nmdc%3A6588474c6a672969ea116ad47680e9e6/download",
                                    "downloads": 0,
                                    "file_type": "Assembly Scaffolds",
                                    "file_type_description": "Final assembly scaffolds fasta",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:d41d8cd98f00b204e9800998ecf8427e",
                                    "name": "gold:Gp0452679_metabat2 bin checkm quality assessment result",
                                    "description": "metabat2 bin checkm quality assessment result for gold:Gp0452679",
                                    "file_size_bytes": 0,
                                    "md5_checksum": "d41d8cd98f00b204e9800998ecf8427e",
                                    "url": "/api/data_object/nmdc%3Ad41d8cd98f00b204e9800998ecf8427e/download",
                                    "downloads": 0,
                                    "file_type": "CheckM Statistics",
                                    "file_type_description": "CheckM statistics report",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:7bc999a16ca8e3bfb555d51199d0da8f",
                                    "name": "gold:Gp0566675_Metagenome Alignment BAM file",
                                    "description": "Metagenome Alignment BAM file for gold:Gp0566675",
                                    "file_size_bytes": 12535917798,
                                    "md5_checksum": "7bc999a16ca8e3bfb555d51199d0da8f",
                                    "url": "/api/data_object/nmdc%3A7bc999a16ca8e3bfb555d51199d0da8f/download",
                                    "downloads": 0,
                                    "file_type": "Assembly Coverage BAM",
                                    "file_type_description": "Sorted bam file of reads mapping back to the final assembly",
                                    "selected": false
                                }
                            ]
                        },
                        {
                            "id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
                            "name": "Annotation Activity for nmdc:mga0xys903",
                            "type": "nmdc:MetagenomeAnnotation",
                            "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
                            "started_at_time": "2022-01-05T01:04:43",
                            "ended_at_time": "2022-01-05T15:56:24",
                            "execution_resource": "NERSC-Cori",
                            "omics_processing_id": "gold:Gp0566675",
                            "outputs": [
                                {
                                    "id": "nmdc:f2fe8b903fbc33dc52b7070d5f025c26",
                                    "name": "gold:Gp0566675_Protein FAA",
                                    "description": "Protein FAA for gold:Gp0566675",
                                    "file_size_bytes": 452210263,
                                    "md5_checksum": "f2fe8b903fbc33dc52b7070d5f025c26",
                                    "url": "/api/data_object/nmdc%3Af2fe8b903fbc33dc52b7070d5f025c26/download",
                                    "downloads": 0,
                                    "file_type": "Annotation Amino Acid FASTA",
                                    "file_type_description": "FASTA amino acid file for annotated proteins",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:d36ff177b4125d3f4d85e23485bed0d9",
                                    "name": "gold:Gp0566675_Structural annotation GFF file",
                                    "description": "Structural annotation GFF file for gold:Gp0566675",
                                    "file_size_bytes": 211518868,
                                    "md5_checksum": "d36ff177b4125d3f4d85e23485bed0d9",
                                    "url": "/api/data_object/nmdc%3Ad36ff177b4125d3f4d85e23485bed0d9/download",
                                    "downloads": 0,
                                    "file_type": "Structural Annotation GFF",
                                    "file_type_description": "GFF3 format file with structural annotations",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:2616f54fe6604bd965f5174c174d0c67",
                                    "name": "gold:Gp0566675_Functional annotation GFF file",
                                    "description": "Functional annotation GFF file for gold:Gp0566675",
                                    "file_size_bytes": 387519457,
                                    "md5_checksum": "2616f54fe6604bd965f5174c174d0c67",
                                    "url": "/api/data_object/nmdc%3A2616f54fe6604bd965f5174c174d0c67/download",
                                    "downloads": 0,
                                    "file_type": "Functional Annotation GFF",
                                    "file_type_description": "GFF3 format file with functional annotations",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:419f975debb1e88caddfa1859c99cddb",
                                    "name": "gold:Gp0566675_KO TSV file",
                                    "description": "KO TSV file for gold:Gp0566675",
                                    "file_size_bytes": 44909434,
                                    "md5_checksum": "419f975debb1e88caddfa1859c99cddb",
                                    "url": "/api/data_object/nmdc%3A419f975debb1e88caddfa1859c99cddb/download",
                                    "downloads": 0,
                                    "file_type": "Annotation KEGG Orthology",
                                    "file_type_description": "Tab delimited file for KO annotation",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:c84ba3a661301b45bbb6b56d889a956b",
                                    "name": "gold:Gp0566675_EC TSV file",
                                    "description": "EC TSV file for gold:Gp0566675",
                                    "file_size_bytes": 29285396,
                                    "md5_checksum": "c84ba3a661301b45bbb6b56d889a956b",
                                    "url": "/api/data_object/nmdc%3Ac84ba3a661301b45bbb6b56d889a956b/download",
                                    "downloads": 0,
                                    "file_type": "Annotation Enzyme Commission",
                                    "file_type_description": "Tab delimited file for EC annotation",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:089df71a6e2994db942945a05122bbfc",
                                    "name": "gold:Gp0566675_COG GFF file",
                                    "description": "COG GFF file for gold:Gp0566675",
                                    "file_size_bytes": 243168059,
                                    "md5_checksum": "089df71a6e2994db942945a05122bbfc",
                                    "url": "/api/data_object/nmdc%3A089df71a6e2994db942945a05122bbfc/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:99ef3156c471c206d762eb6264f8354a",
                                    "name": "gold:Gp0566675_PFAM GFF file",
                                    "description": "PFAM GFF file for gold:Gp0566675",
                                    "file_size_bytes": 234457003,
                                    "md5_checksum": "99ef3156c471c206d762eb6264f8354a",
                                    "url": "/api/data_object/nmdc%3A99ef3156c471c206d762eb6264f8354a/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:dffd918b20d93009a036cf263dc10f97",
                                    "name": "gold:Gp0566675_TigrFam GFF file",
                                    "description": "TigrFam GFF file for gold:Gp0566675",
                                    "file_size_bytes": 33550202,
                                    "md5_checksum": "dffd918b20d93009a036cf263dc10f97",
                                    "url": "/api/data_object/nmdc%3Adffd918b20d93009a036cf263dc10f97/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:bcc0b66e2d64c26413aa09af4f92dee2",
                                    "name": "gold:Gp0566675_SMART GFF file",
                                    "description": "SMART GFF file for gold:Gp0566675",
                                    "file_size_bytes": 65877391,
                                    "md5_checksum": "bcc0b66e2d64c26413aa09af4f92dee2",
                                    "url": "/api/data_object/nmdc%3Abcc0b66e2d64c26413aa09af4f92dee2/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:9b4bfcafe1a92151f2c9bb4fb626aef9",
                                    "name": "gold:Gp0566675_SuperFam GFF file",
                                    "description": "SuperFam GFF file for gold:Gp0566675",
                                    "file_size_bytes": 296375492,
                                    "md5_checksum": "9b4bfcafe1a92151f2c9bb4fb626aef9",
                                    "url": "/api/data_object/nmdc%3A9b4bfcafe1a92151f2c9bb4fb626aef9/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:43922b8d0f06dc9477e66537ba176dd3",
                                    "name": "gold:Gp0566675_Cath FunFam GFF file",
                                    "description": "Cath FunFam GFF file for gold:Gp0566675",
                                    "file_size_bytes": 262861674,
                                    "md5_checksum": "43922b8d0f06dc9477e66537ba176dd3",
                                    "url": "/api/data_object/nmdc%3A43922b8d0f06dc9477e66537ba176dd3/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:db4d0e939563ce9a94ecd47bc3ef402c",
                                    "name": "gold:Gp0566675_CRT GFF file",
                                    "description": "CRT GFF file for gold:Gp0566675",
                                    "file_size_bytes": 89820,
                                    "md5_checksum": "db4d0e939563ce9a94ecd47bc3ef402c",
                                    "url": "/api/data_object/nmdc%3Adb4d0e939563ce9a94ecd47bc3ef402c/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:e84c2d487b63505758057f580e03447c",
                                    "name": "gold:Gp0566675_Genemark GFF file",
                                    "description": "Genemark GFF file for gold:Gp0566675",
                                    "file_size_bytes": 280641652,
                                    "md5_checksum": "e84c2d487b63505758057f580e03447c",
                                    "url": "/api/data_object/nmdc%3Ae84c2d487b63505758057f580e03447c/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:5373f555206a0f45624aff162fd2dce7",
                                    "name": "gold:Gp0566675_Prodigal GFF file",
                                    "description": "Prodigal GFF file for gold:Gp0566675",
                                    "file_size_bytes": 357945627,
                                    "md5_checksum": "5373f555206a0f45624aff162fd2dce7",
                                    "url": "/api/data_object/nmdc%3A5373f555206a0f45624aff162fd2dce7/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:520f5fe9d4f56c04e083388698524763",
                                    "name": "gold:Gp0566675_tRNA GFF File",
                                    "description": "tRNA GFF File for gold:Gp0566675",
                                    "file_size_bytes": 1261571,
                                    "md5_checksum": "520f5fe9d4f56c04e083388698524763",
                                    "url": "/api/data_object/nmdc%3A520f5fe9d4f56c04e083388698524763/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:55edeecb043e9de819eb103d23764239",
                                    "name": "gold:Gp0566675_RFAM misc binding GFF file",
                                    "description": "RFAM misc binding GFF file for gold:Gp0566675",
                                    "file_size_bytes": 633637,
                                    "md5_checksum": "55edeecb043e9de819eb103d23764239",
                                    "url": "/api/data_object/nmdc%3A55edeecb043e9de819eb103d23764239/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:5525ecfe503f367e48abe898021630af",
                                    "name": "gold:Gp0566675_RFAM rRNA GFF file",
                                    "description": "RFAM rRNA GFF file for gold:Gp0566675",
                                    "file_size_bytes": 217284,
                                    "md5_checksum": "5525ecfe503f367e48abe898021630af",
                                    "url": "/api/data_object/nmdc%3A5525ecfe503f367e48abe898021630af/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:736b759bcf01ff7b6393351207bcadc2",
                                    "name": "gold:Gp0566675_RFAM rmRNA GFF file",
                                    "description": "RFAM rmRNA GFF file for gold:Gp0566675",
                                    "file_size_bytes": 120987,
                                    "md5_checksum": "736b759bcf01ff7b6393351207bcadc2",
                                    "url": "/api/data_object/nmdc%3A736b759bcf01ff7b6393351207bcadc2/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:99b729018e0a2b9a70dd9bc16d419124",
                                    "name": "gold:Gp0566675_KO_EC GFF file",
                                    "description": "KO_EC GFF file for gold:Gp0566675",
                                    "file_size_bytes": 144776412,
                                    "md5_checksum": "99b729018e0a2b9a70dd9bc16d419124",
                                    "url": "/api/data_object/nmdc%3A99b729018e0a2b9a70dd9bc16d419124/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                }
                            ]
                        },
                        {
                            "id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
                            "name": "MAGs Analysis Activity for nmdc:mga0xys903",
                            "type": "nmdc:MAGsAnalysisActivity",
                            "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
                            "started_at_time": "2022-01-05T01:04:43",
                            "ended_at_time": "2022-01-05T15:56:24",
                            "execution_resource": "NERSC-Cori",
                            "omics_processing_id": "gold:Gp0566675",
                            "outputs": [
                                {
                                    "id": "nmdc:b0c0e8a3fb844d94a8028d1cf1391ddf",
                                    "name": "gold:Gp0566675_metabat2 bin checkm quality assessment result",
                                    "description": "metabat2 bin checkm quality assessment result for gold:Gp0566675",
                                    "file_size_bytes": 765,
                                    "md5_checksum": "b0c0e8a3fb844d94a8028d1cf1391ddf",
                                    "url": "/api/data_object/nmdc%3Ab0c0e8a3fb844d94a8028d1cf1391ddf/download",
                                    "downloads": 0,
                                    "file_type": "CheckM Statistics",
                                    "file_type_description": "CheckM statistics report",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:a3f49b7c64ce6956f58de8b716e6851c",
                                    "name": "gold:Gp0566675_high-quality and medium-quality bins",
                                    "description": "high-quality and medium-quality bins for gold:Gp0566675",
                                    "file_size_bytes": 283643734,
                                    "md5_checksum": "a3f49b7c64ce6956f58de8b716e6851c",
                                    "url": "/api/data_object/nmdc%3Aa3f49b7c64ce6956f58de8b716e6851c/download",
                                    "downloads": 0,
                                    "file_type": "Metagenome Bins",
                                    "file_type_description": "Metagenome bin contigs fasta",
                                    "selected": false
                                }
                            ]
                        },
                        {
                            "id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
                            "name": "ReadBased Analysis Activity for nmdc:mga0xys903",
                            "type": "nmdc:ReadbasedAnalysis",
                            "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
                            "started_at_time": "2022-01-05T01:04:43",
                            "ended_at_time": "2022-01-05T15:56:24",
                            "execution_resource": "NERSC-Cori",
                            "omics_processing_id": "gold:Gp0566675",
                            "outputs": [
                                {
                                    "id": "nmdc:917cb0bccb50b78d08bb40b16ce041b2",
                                    "name": "gold:Gp0566675_Gottcha2 TSV report",
                                    "description": "Gottcha2 TSV report for gold:Gp0566675",
                                    "file_size_bytes": 13642,
                                    "md5_checksum": "917cb0bccb50b78d08bb40b16ce041b2",
                                    "url": "/api/data_object/nmdc%3A917cb0bccb50b78d08bb40b16ce041b2/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:a28043bbb50d143658fabf376069252e",
                                    "name": "gold:Gp0566675_Gottcha2 full TSV report",
                                    "description": "Gottcha2 full TSV report for gold:Gp0566675",
                                    "file_size_bytes": 1379048,
                                    "md5_checksum": "a28043bbb50d143658fabf376069252e",
                                    "url": "/api/data_object/nmdc%3Aa28043bbb50d143658fabf376069252e/download",
                                    "downloads": 0,
                                    "file_type": null,
                                    "file_type_description": null,
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:bbde2a548568774d69fb81de7b3cd8a0",
                                    "name": "gold:Gp0566675_Gottcha2 Krona HTML report",
                                    "description": "Gottcha2 Krona HTML report for gold:Gp0566675",
                                    "file_size_bytes": 269443,
                                    "md5_checksum": "bbde2a548568774d69fb81de7b3cd8a0",
                                    "url": "/api/data_object/nmdc%3Abbde2a548568774d69fb81de7b3cd8a0/download",
                                    "downloads": 0,
                                    "file_type": "GOTTCHA2 Krona Plot",
                                    "file_type_description": "GOTTCHA2 krona plot HTML file",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:557632d5f7308b8d24be454bb5768efe",
                                    "name": "gold:Gp0566675_Centrifuge classification TSV report",
                                    "description": "Centrifuge classification TSV report for gold:Gp0566675",
                                    "file_size_bytes": 12688495975,
                                    "md5_checksum": "557632d5f7308b8d24be454bb5768efe",
                                    "url": "/api/data_object/nmdc%3A557632d5f7308b8d24be454bb5768efe/download",
                                    "downloads": 0,
                                    "file_type": "Centrifuge Taxonomic Classification",
                                    "file_type_description": "Centrifuge output read classification file",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:df4dc44574d6955359ecd2a3e7840a89",
                                    "name": "gold:Gp0566675_Centrifuge TSV report",
                                    "description": "Centrifuge TSV report for gold:Gp0566675",
                                    "file_size_bytes": 269384,
                                    "md5_checksum": "df4dc44574d6955359ecd2a3e7840a89",
                                    "url": "/api/data_object/nmdc%3Adf4dc44574d6955359ecd2a3e7840a89/download",
                                    "downloads": 0,
                                    "file_type": "Centrifuge Classification Report",
                                    "file_type_description": "Centrifuge output report file",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:5d10a51da4a121bd6784430a5380e64e",
                                    "name": "gold:Gp0566675_Centrifuge Krona HTML report",
                                    "description": "Centrifuge Krona HTML report for gold:Gp0566675",
                                    "file_size_bytes": 2363784,
                                    "md5_checksum": "5d10a51da4a121bd6784430a5380e64e",
                                    "url": "/api/data_object/nmdc%3A5d10a51da4a121bd6784430a5380e64e/download",
                                    "downloads": 0,
                                    "file_type": "Centrifuge Krona Plot",
                                    "file_type_description": "Centrifug krona plot HTML file",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:b88abb551eb44ff80543cc0c746bcd1c",
                                    "name": "gold:Gp0566675_Kraken classification TSV report",
                                    "description": "Kraken classification TSV report for gold:Gp0566675",
                                    "file_size_bytes": 10244299516,
                                    "md5_checksum": "b88abb551eb44ff80543cc0c746bcd1c",
                                    "url": "/api/data_object/nmdc%3Ab88abb551eb44ff80543cc0c746bcd1c/download",
                                    "downloads": 0,
                                    "file_type": "Kraken2 Taxonomic Classification",
                                    "file_type_description": "Kraken2 output read classification file",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:e18e7a581221086ea90e6591ac260dbc",
                                    "name": "gold:Gp0566675_Kraken2 TSV report",
                                    "description": "Kraken2 TSV report for gold:Gp0566675",
                                    "file_size_bytes": 629786,
                                    "md5_checksum": "e18e7a581221086ea90e6591ac260dbc",
                                    "url": "/api/data_object/nmdc%3Ae18e7a581221086ea90e6591ac260dbc/download",
                                    "downloads": 0,
                                    "file_type": "Kraken2 Classification Report",
                                    "file_type_description": "Kraken2 output report file",
                                    "selected": false
                                },
                                {
                                    "id": "nmdc:efea0f4cc8ce0313674031eb244db10b",
                                    "name": "gold:Gp0566675_Kraken2 Krona HTML report",
                                    "description": "Kraken2 Krona HTML report for gold:Gp0566675",
                                    "file_size_bytes": 3946508,
                                    "md5_checksum": "efea0f4cc8ce0313674031eb244db10b",
                                    "url": "/api/data_object/nmdc%3Aefea0f4cc8ce0313674031eb244db10b/download",
                                    "downloads": 0,
                                    "file_type": "Kraken2 Krona Plot",
                                    "file_type_description": "Kraken2 krona plot HTML file",
                                    "selected": false
                                }
                            ]
                        }
                    ],
                    "outputs": [
                        {
                            "id": "nmdc:76f897f36baa40832bf2ed42eb31b947",
                            "name": "52550.4.380800.TGATGTCC-TGATGTCC.fastq.gz",
                            "description": "Raw sequencer read data",
                            "file_size_bytes": 11740110153,
                            "md5_checksum": null,
                            "url": null,
                            "downloads": 0,
                            "file_type": null,
                            "file_type_description": null,
                            "selected": false
                        }
                    ]
                }
            ],
            "multiomics": 8
        }
        // ...
    ]
}
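
Data objects sit several levels deep in this response. As a rough sketch, a recursive walk can collect them without depending on the names of the enclosing keys. The helper names (`iter_data_objects`, `downloadable`) and `BASE_URL` are illustrative. Resolving the relative `url` values against the host used in this page's examples is an assumption.

```python
from typing import Any, Dict, Iterator
from urllib.parse import urljoin

# Assumption: relative download paths resolve against the host used on this page.
BASE_URL = "https://data.dev.microbiomedata.org"


def iter_data_objects(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every nested dict that looks like a data object.

    A dict counts as a data object when it has both "url" and
    "file_size_bytes" keys, as the entries under "outputs" do.
    """
    if isinstance(node, dict):
        if "url" in node and "file_size_bytes" in node:
            yield node
        for value in node.values():
            yield from iter_data_objects(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_data_objects(item)


def downloadable(node: Any) -> Dict[str, str]:
    """Map data object id to an absolute download URL, skipping null URLs."""
    return {
        obj["id"]: urljoin(BASE_URL, obj["url"])
        for obj in iter_data_objects(node)
        if obj.get("url")
    }


# Trimmed fragment shaped like the response above (most fields omitted).
sample = [
    {
        "id": "nmdc:a96f6db483678ec53627749ea3f4a6e7",
        "type": "nmdc:MAGsAnalysisActivity",
        "outputs": [
            {
                "id": "nmdc:b0c0e8a3fb844d94a8028d1cf1391ddf",
                "file_size_bytes": 765,
                "url": "/api/data_object/nmdc%3Ab0c0e8a3fb844d94a8028d1cf1391ddf/download",
            }
        ],
    },
    {
        "outputs": [
            {
                "id": "nmdc:76f897f36baa40832bf2ed42eb31b947",
                "file_size_bytes": 11740110153,
                "url": None,  # raw read data has no download URL in the sample
            }
        ]
    },
]

for object_id, url in downloadable(sample).items():
    print(object_id, url)
```

Entries whose `url` is `null`, such as the raw sequencer reads in the sample, are skipped rather than turned into broken links.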

Facet

Find the count of biosamples for each geographic location.

POST https://data.dev.microbiomedata.org/api/biosample/facet

Payload
{
    "conditions": [
        {
            "op": "==",
            "field": "principal_investigator_name",
            "value": "Mitchel J. Doktycz",
            "table": "study"
        },
        {
            "op": "==",
            "field": "env_broad_scale",
            "value": "terrestrial biome",
            "table": "biosample"
        },
        {
            "op": "==",
            "field": "env_medium",
            "value": "bulk soil",
            "table": "biosample"
        },
        {
            "op": "==",
            "field": "omics_type",
            "value": "Metagenome",
            "table": "omics_processing"
        },
        {
            "op": "==",
            "field": "processing_institution",
            "value": "JGI",
            "table": "omics_processing"
        }
    ],
    "attribute": "geo_loc_name"
}
Response
{
    "facets": {
        "USA: Oregon": 103,
        "USA: Tennessee": 10
    }
}
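
A facet request like the one above could be assembled and sent from Python roughly as follows. This is a minimal sketch using only the standard library. The helper names (`condition`, `facet_payload`, `post_json`, `ranked`) are made up for illustration and are not part of the API. The network call is left commented out so the snippet runs offline.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable, List, Tuple

FACET_URL = "https://data.dev.microbiomedata.org/api/biosample/facet"


def condition(table: str, field: str, value: Any, op: str = "==") -> Dict[str, Any]:
    """Build one entry of the "conditions" list."""
    return {"op": op, "field": field, "value": value, "table": table}


def facet_payload(attribute: str, conditions: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Build the JSON body for a facet request."""
    return {"conditions": list(conditions), "attribute": attribute}


def post_json(url: str, payload: Dict[str, Any]) -> Any:
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def ranked(facets: Dict[str, int]) -> List[Tuple[str, int]]:
    """Sort facet values by count, largest first."""
    return sorted(facets.items(), key=lambda item: item[1], reverse=True)


payload = facet_payload(
    "geo_loc_name",
    [
        condition("study", "principal_investigator_name", "Mitchel J. Doktycz"),
        condition("omics_processing", "omics_type", "Metagenome"),
    ],
)

# facets = post_json(FACET_URL, payload)["facets"]  # uncomment to call the API
facets = {"USA: Oregon": 103, "USA: Tennessee": 10}  # sample response from above
for value, count in ranked(facets):
    print(f"{value}: {count}")
```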

Binned Facet

Find the number of biosamples collected in each month.

POST https://data.dev.microbiomedata.org/api/biosample/binned_facet

Payload
{
    "attribute": "collection_date",
    "conditions": [
        {
            "op": "==",
            "field": "principal_investigator_name",
            "value": "Mitchel J. Doktycz",
            "table": "study"
        },
        {
            "op": "==",
            "field": "env_broad_scale",
            "value": "terrestrial biome",
            "table": "biosample"
        },
        {
            "op": "==",
            "field": "env_medium",
            "value": "bulk soil",
            "table": "biosample"
        },
        {
            "op": "==",
            "field": "omics_type",
            "value": "Metagenome",
            "table": "omics_processing"
        },
        {
            "op": "==",
            "field": "processing_institution",
            "value": "JGI",
            "table": "omics_processing"
        }
    ],
    "resolution": "month"
}
Response
{
    "facets": [
        10,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        103
    ],
    "bins": [
        "2014-08-01T00:00:00",
        "2014-09-01T00:00:00",
        "2014-10-01T00:00:00",
        "2014-11-01T00:00:00",
        "2014-12-01T00:00:00",
        "2015-01-01T00:00:00",
        "2015-02-01T00:00:00",
        "2015-03-01T00:00:00",
        "2015-04-01T00:00:00",
        "2015-05-01T00:00:00",
        "2015-06-01T00:00:00",
        "2015-07-01T00:00:00",
        "2015-08-01T00:00:00",
        "2015-09-01T00:00:00",
        "2015-10-01T00:00:00",
        "2015-11-01T00:00:00",
        "2015-12-01T00:00:00",
        "2016-01-01T00:00:00",
        "2016-02-01T00:00:00",
        "2016-03-01T00:00:00",
        "2016-04-01T00:00:00",
        "2016-05-01T00:00:00",
        "2016-06-01T00:00:00",
        "2016-07-01T00:00:00",
        "2016-08-01T00:00:00",
        "2016-09-01T00:00:00",
        "2016-10-01T00:00:00",
        "2016-11-01T00:00:00",
        "2016-12-01T00:00:00",
        "2017-01-01T00:00:00",
        "2017-02-01T00:00:00",
        "2017-03-01T00:00:00",
        "2017-04-01T00:00:00",
        "2017-05-01T00:00:00",
        "2017-06-01T00:00:00",
        "2017-07-01T00:00:00",
        "2017-08-01T00:00:00",
        "2017-09-01T00:00:00",
        "2017-10-01T00:00:00",
        "2017-11-01T00:00:00",
        "2017-12-01T00:00:00",
        "2018-01-01T00:00:00",
        "2018-02-01T00:00:00",
        "2018-03-01T00:00:00",
        "2018-04-01T00:00:00",
        "2018-05-01T00:00:00",
        "2018-06-01T00:00:00",
        "2018-07-01T00:00:00",
        "2018-08-01T00:00:00",
        "2018-09-01T00:00:00",
        "2018-10-01T00:00:00",
        "2018-11-01T00:00:00",
        "2018-12-01T00:00:00",
        "2019-01-01T00:00:00",
        "2019-02-01T00:00:00",
        "2019-03-01T00:00:00",
        "2019-04-01T00:00:00",
        "2019-05-01T00:00:00",
        "2019-06-01T00:00:00",
        "2019-07-01T00:00:00",
        "2019-08-01T00:00:00",
        "2019-09-01T00:00:00",
        "2019-10-01T00:00:00",
        "2019-11-01T00:00:00",
        "2019-12-01T00:00:00",
        "2020-01-01T00:00:00",
        "2020-02-01T00:00:00",
        "2020-03-01T00:00:00",
        "2020-04-01T00:00:00",
        "2020-05-01T00:00:00",
        "2020-06-01T00:00:00",
        "2020-07-01T00:00:00",
        "2020-08-01T00:00:00",
        "2020-09-01T00:00:00",
        "2020-10-01T00:00:00"
    ]
}
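
In the sample response above, `facets` holds 74 counts and `bins` holds 75 timestamps. That suggests the `bins` values are bin edges, with `facets[i]` counting the biosamples between `bins[i]` and `bins[i + 1]`. This reading is inferred from the sample and is not a documented guarantee. Under that assumption, a sketch for pairing counts with their date ranges and dropping empty months might look like this (`nonzero_bins` is an illustrative name):

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple


def nonzero_bins(response: Dict[str, Any]) -> List[Tuple[datetime, datetime, int]]:
    """Return (start, end, count) for every bin with a non-zero count.

    Assumes "bins" lists the bin edges, so it has one more entry than "facets".
    """
    edges = [datetime.fromisoformat(edge) for edge in response["bins"]]
    counts = response["facets"]
    if len(edges) != len(counts) + 1:
        raise ValueError("expected one more bin edge than facet counts")
    return [
        (start, end, count)
        for start, end, count in zip(edges, edges[1:], counts)
        if count
    ]


# The first three bins of the response above.
sample = {
    "facets": [10, 0, 0],
    "bins": [
        "2014-08-01T00:00:00",
        "2014-09-01T00:00:00",
        "2014-10-01T00:00:00",
        "2014-11-01T00:00:00",
    ],
}

for start, end, count in nonzero_bins(sample):
    print(f"{start:%Y-%m}: {count}")
```

Applied to the full response, this would leave only the two months that contain biosamples.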