Proposal: the attribute endpoint #80

Closed
nsheff opened this issue Jun 12, 2024 · 6 comments
Labels
enhancement New feature or request

@nsheff
Member

nsheff commented Jun 12, 2024

Capturing an idea from today's meeting.

Motivation

In various recent discussions, people have re-emphasized the utility of a 'sequences' digest or a 'sorted_sequences' digest. See, for example:

Among other things, these have re-raised some questions, like: should names and lengths really be required? Should we relax the requirement/recommendation of a 'core' schema and just let people use whatever schema they want?

Basically after today's rousing discussion, I see 2 paths forward:

  1. We can de-emphasize the shared schema. This will allow use cases that want to not require names, or make collections with sequences only, to proceed more readily. We were already going down this route, and indeed, the current spec already only lists the schema as RECOMMENDED, but we discussed going even further down that route. The downside is that now everyone would be more likely to just use a custom schema, and interoperability of top-level digests is lost, which feels like a significant loss.
  2. We can introduce a new endpoint... (see new proposal below)

Proposal

If we add another endpoint, we could possibly better accommodate the sequence-only-digest use cases, while maintaining the desirable interoperability of top-level digests we get from a shared core schema. Here it is:

3.3 Attribute

  • Endpoint: GET /attribute/:attribute_name/:digest (RECOMMENDED?)
  • Description: Retrieves values of specific attributes in a sequence collection. Here :attribute_name is the name of the attribute, such as sequences, names, or sorted_sequences. :digest is the level 1 digest computed above.
  • Return value: The attribute value identified by the :digest variable. The structure of the return value should correspond to the value of the attribute in the canonical structure.

Example /attribute/lengths/:digest return value:

["1216","970","1788"]

Example /attribute/names/:digest return value:

["A","B","C"]
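
To make the proposed lookup concrete, here is a minimal in-memory sketch of how a server might resolve /attribute/:attribute_name/:digest. It is not taken from any implementation: the ATTRIBUTES store, the get_attribute function, and the digest strings are all hypothetical placeholders, and a real server would sit behind an HTTP framework.

```python
import json

# Hypothetical store: (attribute_name, level 1 digest) -> attribute value.
# The digest strings are placeholders, not real digests.
ATTRIBUTES = {
    ("lengths", "digest-of-lengths"): ["1216", "970", "1788"],
    ("names", "digest-of-names"): ["A", "B", "C"],
}


def get_attribute(path: str) -> str:
    """Resolve /attribute/:attribute_name/:digest to a JSON response body."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "attribute":
        raise ValueError(f"not an attribute path: {path!r}")
    _, attribute_name, digest = parts
    try:
        value = ATTRIBUTES[(attribute_name, digest)]
    except KeyError:
        # A real server would presumably map this to a 404 response.
        raise LookupError(
            f"no {attribute_name!r} attribute with digest {digest!r}"
        ) from None
    # Return the value in the same structure it has in the collection.
    return json.dumps(value, separators=(",", ":"))


if __name__ == "__main__":
    print(get_attribute("/attribute/lengths/digest-of-lengths"))
    print(get_attribute("/attribute/names/digest-of-names"))
```

The store is keyed by the attribute name together with the digest, since the same digest string is only meaningful relative to the attribute it was computed from.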

How this helps

With an attribute endpoint, use cases that have no need for names and lengths could just use level 1 digests for sequences (or sorted_sequences) as their primary use case, and these would be interoperable with sequence collection servers. You'd just implement this endpoint instead of the /collection endpoint, and not bother computing the top-level digests, if you had no need for that. If you needed to look up a digest in an external reference provider, you'd just use the /attribute endpoint instead of the /collection endpoint.

We could then move 'sequences' back to required for the core schema, and use this /attribute endpoint to solve the coordinate system problems as well. This would have the advantage of keeping the top-level digests more likely to be interoperable because they'd be more likely to follow the same schema.
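
As a rough illustration of the sequences-only use case, the sketch below stores arrays of sequence digests keyed by their level 1 digest and answers only attribute lookups. This issue does not spell out the digest algorithm, so the sketch assumes a GA4GH-style truncated SHA-512 (sha512t24u) over a compact JSON serialization, which only approximates full JSON canonicalization for simple ASCII string arrays. The SequencesOnlyStore class and the "SQ.placeholder-*" values are hypothetical.

```python
import base64
import hashlib
import json


def sha512t24u(data: bytes) -> str:
    """Assumed digest: first 24 bytes of SHA-512, base64url-encoded."""
    return base64.urlsafe_b64encode(hashlib.sha512(data).digest()[:24]).decode("ascii")


def level1_digest(value: list) -> str:
    """Digest one attribute array; compact JSON stands in for canonicalization."""
    canonical = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    return sha512t24u(canonical.encode("utf-8"))


class SequencesOnlyStore:
    """Hypothetical store that serves only the 'sequences' attribute."""

    def __init__(self) -> None:
        self._by_digest: dict[str, list[str]] = {}

    def add(self, sequence_digests: list[str]) -> str:
        """Register an array of sequence digests; return its level 1 digest."""
        digest = level1_digest(sequence_digests)
        self._by_digest[digest] = list(sequence_digests)
        return digest

    def attribute(self, attribute_name: str, digest: str) -> list[str]:
        """Back a route like /attribute/sequences/:digest; no /collection needed."""
        if attribute_name != "sequences":
            raise LookupError(f"attribute {attribute_name!r} is not served here")
        return self._by_digest[digest]


if __name__ == "__main__":
    store = SequencesOnlyStore()
    # Placeholder sequence digests, not real ones.
    d = store.add(["SQ.placeholder-1", "SQ.placeholder-2"])
    print(d, store.attribute("sequences", d))
```

Note that no top-level digest is ever computed here, which is the point of the proposal: the level 1 digest of sequences is enough to identify and retrieve the array.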

@nsheff nsheff added the enhancement New feature or request label Jun 12, 2024
@nsheff nsheff added this to the v1.1 milestone Jun 26, 2024
@nsheff
Member Author

nsheff commented Jul 29, 2024

I like the /attribute endpoint, but right now it's restricted to attributes of collections. Should there also be an equivalent endpoint to get at attributes of pangenomes? And if so, should the endpoints change?

I have 2 proposals:

  1. Just use /attribute, which is for attributes of collections, and ignore pangenomes.
  2. Split the endpoint into two, one for pangenome attributes and one for collection attributes. Use something like /collection/attribute/{attribute}/{attribute_digest} and /pangenome/attribute/{attribute}/{attribute_digest}.

@andrewyatz
Collaborator

I did wonder if we should structure this as /attribute/{type}/{digest}/{attribute}, meaning /attribute/collection/:digest/lengths and /attribute/pangenome/:digest/{attribute}, but I think it's not the digest of the collection we're talking about here but the digest of the attributes, so restructuring it as you said makes sense. I would, though, maybe keep /attribute/{type}.

@nsheff
Member Author

nsheff commented Aug 6, 2024

So you're suggesting something like this /attribute/{object_type}/{attribute_name}/{attribute_digest}, eg:

  • /attribute/collection/lengths/{attribute_digest}
  • /attribute/pangenome/names/{attribute_digest}
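
For illustration, the path layout suggested above could be parsed and rebuilt as in the sketch below. The names AttributeRequest, parse_attribute_path, and build_attribute_path are hypothetical, and restricting object_type to collection and pangenome is an assumption based on the two types mentioned in this thread.

```python
from typing import NamedTuple

# Assumed set of object types; only these two come up in the discussion.
OBJECT_TYPES = {"collection", "pangenome"}


class AttributeRequest(NamedTuple):
    object_type: str
    attribute_name: str
    attribute_digest: str


def parse_attribute_path(path: str) -> AttributeRequest:
    """Parse /attribute/{object_type}/{attribute_name}/{attribute_digest}."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "attribute":
        raise ValueError(f"not an attribute path: {path!r}")
    _, object_type, attribute_name, attribute_digest = parts
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"unknown object type: {object_type!r}")
    return AttributeRequest(object_type, attribute_name, attribute_digest)


def build_attribute_path(request: AttributeRequest) -> str:
    """Inverse of parse_attribute_path."""
    return (
        f"/attribute/{request.object_type}"
        f"/{request.attribute_name}/{request.attribute_digest}"
    )


if __name__ == "__main__":
    # "some-digest" is a placeholder, not a real attribute digest.
    req = parse_attribute_path("/attribute/collection/lengths/some-digest")
    print(req)
    print(build_attribute_path(req))
```

Keeping /attribute as the first path segment means the thing being retrieved is always named first, with the object type acting only as a namespace for the attribute.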

@sveinugu
Collaborator

sveinugu commented Aug 7, 2024

What about removing the word "attribute" completely, and instead adding to the /collection and /pangenome endpoints, e.g. /collection/lengths and /pangenome/names?

@nsheff
Member Author

nsheff commented Aug 7, 2024

Well, I thought the basic REST principle was that the first path parameter would be the thing you're GETting, hence /attribute (analogous to /collection and /list).

@nsheff
Member Author

nsheff commented Nov 20, 2024

The /attribute endpoint has been added to the spec and ADR, so I'm going to close this issue.

@nsheff nsheff closed this as completed Nov 20, 2024

3 participants