Proposal: the attribute endpoint #80

nsheff · 2024-06-12T22:27:08Z

Capturing an idea from today's meeting.

Motivation

In various recent discussions, people have re-emphasized the utility of a 'sequences' digest or a 'sorted_sequences' digest. See, for example:

Among other things, these have re-raised some questions, like: should names and lengths really be required? Should we relax the requirement/recommendation of a 'core' schema and just let people use whatever schema they want?

Basically after today's rousing discussion, I see 2 paths forward:

1. We can de-emphasize the shared schema. This will allow use cases that want to not require names, or make collections with sequences only, to proceed more readily. We were already going down this route, and indeed, the current spec already only lists the schema as RECOMMENDED, but we discussed going even further down that route. The downside is that now everyone would be more likely to just use a custom schema, and interoperability of top-level digests is lost, which feels like a significant loss.
2. We can introduce a new endpoint... (see new proposal below)

Proposal

If we add another endpoint, we could possibly better accommodate the sequence-only-digest use cases, while maintaining the desirable interoperability of top-level digests we get from a shared core schema. Here it is:

3.3 Attribute

Endpoint: GET /attribute/:attribute_name/:digest (RECOMMENDED?)
Description: Retrieves values of specific attributes in a sequence collection. Here :attribute_name is the name of the attribute, such as sequences, names, or sorted_sequences. :digest is the level 1 digest computed above.
Return value: The attribute value identified by the :digest variable. The structure of the return value should correspond to the value of the attribute in the canonical structure.

Example /attribute/lengths/:digest return value:

["1216","970","1788"]

Example /attribute/names/:digest return value:

["A","B","C"]
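To make the lookup concrete, here is a minimal sketch of what a server might keep behind GET /attribute/:attribute_name/:digest. It is an illustration only, not the spec's implementation: the `AttributeStore` class and its methods are hypothetical, and the digest is assumed to be a GA4GH-style truncated SHA-512 (first 24 bytes, base64url-encoded) over compact JSON, which only approximates full canonical JSON for simple arrays like the ones above.

```python
import base64
import hashlib
import json


def sha512t24u(data: bytes) -> str:
    """Truncated SHA-512: keep the first 24 bytes, base64url-encode them."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def level1_digest(value) -> str:
    """Digest a single attribute value.

    Assumption: compact JSON stands in for the canonical serialization.
    """
    canonical = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    return sha512t24u(canonical.encode("utf-8"))


class AttributeStore:
    """Hypothetical in-memory backing for /attribute/:attribute_name/:digest."""

    def __init__(self):
        # Keyed by (attribute_name, level 1 digest) -> attribute value.
        self._values = {}

    def add(self, attribute_name, value):
        """Store a value and return the digest it can be retrieved by."""
        digest = level1_digest(value)
        self._values[(attribute_name, digest)] = value
        return digest

    def get(self, attribute_name, digest):
        """Return the stored value, or None if the pair is unknown."""
        return self._values.get((attribute_name, digest))


store = AttributeStore()
names_digest = store.add("names", ["A", "B", "C"])
lengths_digest = store.add("lengths", ["1216", "970", "1788"])
```

Calling `store.get("names", names_digest)` returns the names array. The same digest under a different attribute name returns nothing, because the key includes the attribute name.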

How this helps

With an attribute endpoint, use cases that have no need for names and lengths could just use level 1 digests for sequences (or sorted_sequences) as their primary use case, and these would be interoperable with sequence collection servers. You'd just implement this endpoint instead of the /collection endpoint, and not bother computing the top-level digests, if you had no need for that. If you needed to look up a digest in an external reference provider, you'd just use the /attribute endpoint instead of the /collection endpoint.

We could then move 'sequences' back to required for the core schema, and use this /attribute endpoint to solve the coordinate system problems as well. This would have the advantage of keeping the top-level digests more likely to be interoperable because they'd be more likely to follow the same schema.


nsheff · 2024-07-29T18:10:57Z

I like the /attribute endpoint, but right now it's restricted to attributes of collections. Should there also be an equivalent endpoint to get at attributes of pangenomes? And if so, should the endpoints change?

I have 2 proposals:

1. Just use /attribute, which is for attributes of collections, and ignore pangenomes.
2. Split the endpoint into two, one for pangenome attributes and one for collection attributes. Use something like /collection/attribute/{attribute}/{attribute_digest} and /pangenome/attribute/{attribute}/{attribute_digest}.

andrewyatz · 2024-08-06T17:07:33Z

I did wonder if we should structure it as /attribute/{type}/{digest}/{attribute}, meaning /attribute/collection/:digest/lengths and /attribute/pangenome/:digest/{attribute}, but I think it's not the digest of the collection we're talking about here but the digest of the attribute, so restructuring as you said makes sense. I would, though, maybe keep /attribute/{type}.

nsheff · 2024-08-06T19:19:35Z

So you're suggesting something like this /attribute/{object_type}/{attribute_name}/{attribute_digest}, eg:

/attribute/collection/lengths/{attribute_digest}
/attribute/pangenome/names/{attribute_digest}
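As a rough sketch of how a server could route paths of this shape, the snippet below splits /attribute/{object_type}/{attribute_name}/{attribute_digest} into its three parts. The function name, the `AttributeRoute` tuple, and the allowed object types are assumptions made for illustration; nothing here is taken from the spec.

```python
from typing import NamedTuple

# Assumed set of object types, taken from the two examples in this thread.
OBJECT_TYPES = {"collection", "pangenome"}


class AttributeRoute(NamedTuple):
    object_type: str
    attribute_name: str
    attribute_digest: str


def parse_attribute_path(path: str) -> AttributeRoute:
    """Parse /attribute/{object_type}/{attribute_name}/{attribute_digest}."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "attribute":
        raise ValueError(f"not an attribute path: {path!r}")
    _, object_type, attribute_name, attribute_digest = parts
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"unknown object type: {object_type!r}")
    return AttributeRoute(object_type, attribute_name, attribute_digest)


route = parse_attribute_path("/attribute/collection/lengths/abc123")
```

Here `abc123` is a placeholder digest. With this layout the digest is always the last path segment, so it refers to the attribute value rather than to the enclosing collection or pangenome.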

sveinugu · 2024-08-07T13:36:24Z

What about removing the word "attribute" completely, and instead adding to the /collection and /pangenome endpoints, e.g. /collections/lengths and /pangenome/names?

nsheff · 2024-08-07T14:03:17Z

Well, I thought the basic REST principle was that the first param in the path parameters would be the thing you're GETting..., hence /attribute

(analogous to /collection and /list )

nsheff · 2024-11-20T19:41:55Z

The /attribute endpoint has been added to the spec and ADR, so I'm going to close this issue.

nsheff mentioned this issue Jun 12, 2024

Use case: a digest for a collection of sequences #76

Open

nsheff added the enhancement New feature or request label Jun 12, 2024

nsheff added this to the v1.1 milestone Jun 26, 2024

nsheff mentioned this issue Aug 6, 2024

Should lengths and names be required properties in every sequence collection ? #72

Open

nsheff mentioned this issue Oct 18, 2024

Advanced attribute qualifiers: passthru and transient attributes #86

Open

nsheff closed this as completed Nov 20, 2024
