feat: Impl Resource sync interface in agent side #2529

Draft
wants to merge 1 commit into
base: main
from

fregataa
Member

@fregataa fregataa commented Jul 22, 2024

Agent's sync-and-get-kernels() API
This API synchronizes the agent's kernels with the kernel information passed as API parameters (preparing_kernels, pulling_kernels, running_kernels, terminating_kernels). It assumes that the kernel information given in the parameters is the "truth".
If any kernel information in kernel_registry does not match running_kernels (or terminating_kernels), the agent injects a termination event to terminate that kernel.

sync-and-get-kernels() returns the actual { running, terminating, terminated } kernels (the return value is not used for now). actual_terminated_kernels contains the kernels that the API parameters list in running_kernels but that have already terminated.
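
To make the intended behavior easier to review, here is a minimal sketch of the reconciliation described above. It is not the code in this PR: the `Status` enum, the `SyncResult` dataclass, the `to_terminate` field, and the exact mismatch rule are assumptions made for illustration, and the real kernel_registry holds kernel objects rather than plain status values.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical local kernel states; the real agent tracks more detail.
    RUNNING = "running"
    TERMINATING = "terminating"


@dataclass
class SyncResult:
    actual_running: set[str] = field(default_factory=set)
    actual_terminating: set[str] = field(default_factory=set)
    actual_terminated: set[str] = field(default_factory=set)
    # Kernels for which a termination event would be injected (illustrative).
    to_terminate: set[str] = field(default_factory=set)


def sync_and_get_kernels(
    registry: dict[str, Status],
    preparing: set[str],
    pulling: set[str],
    running: set[str],
    terminating: set[str],
) -> SyncResult:
    """Reconcile the local registry against the manager's view (the "truth")."""
    result = SyncResult()
    known = preparing | pulling | running | terminating
    for kernel_id, status in registry.items():
        if kernel_id not in known or kernel_id in terminating:
            # Assumed mismatch rule: the manager does not know this kernel,
            # or expects it to be going away, so it must be terminated.
            if status is not Status.TERMINATING:
                result.to_terminate.add(kernel_id)
            result.actual_terminating.add(kernel_id)
        elif status is Status.TERMINATING:
            result.actual_terminating.add(kernel_id)
        else:
            result.actual_running.add(kernel_id)
    # Kernels the manager believes are running but the agent no longer has.
    result.actual_terminated = running - set(registry)
    return result


example = sync_and_get_kernels(
    registry={
        "a": Status.RUNNING,
        "b": Status.RUNNING,
        "c": Status.TERMINATING,
        "d": Status.RUNNING,
    },
    preparing=set(),
    pulling=set(),
    running={"a", "e"},
    terminating={"c", "d"},
)
```

In this example, kernel "b" is unknown to the manager and "d" is expected to be terminating, so both would receive a termination event, while "e" is reported back as already terminated.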

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version

@github-actions github-actions bot added comp:manager Related to Manager component comp:agent Related to Agent component comp:common Related to Common component size:L 100~500 LoC labels Jul 22, 2024
@fregataa fregataa added this to the 24.09 milestone Jul 22, 2024
@fregataa fregataa added the skip:changelog Make the action workflow to skip towncrier check label Jul 22, 2024
@fregataa fregataa requested a review from achimnol July 22, 2024 09:57
@fregataa fregataa marked this pull request as ready for review July 22, 2024 09:57
@fregataa fregataa force-pushed the topic/07-22-feat_implement_resource_sync_interface branch 3 times, most recently from 826ec90 to b590466 Compare August 8, 2024 03:59
Member

@achimnol achimnol left a comment

Some early reviews!

@@ -711,6 +711,19 @@ def check_and_return(self, value: Any) -> set:
self._failure("value must be Iterable")


class ToList(t.List):
Member

Could we just reuse trafaret.base.List?

Member Author

@fregataa fregataa Aug 10, 2024

trafaret.base.List allows only list-type data, and I want to allow any iterable data type here, because sequential data fetched from an RPC response gets deserialized into a tuple.
image.png
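
The difference discussed in this thread can be sketched without trafaret. The two helpers below are hypothetical stand-ins, not trafaret's API and not the `ToList` class from this PR: `strict_list` mimics a validator that accepts only `list`, while `to_list` accepts any iterable and converts it. Rejecting `str` and `bytes` is an extra assumption of this sketch.

```python
from collections.abc import Iterable
from typing import Any


def strict_list(value: Any) -> list:
    # Stand-in for a list-only check: tuples are rejected.
    if not isinstance(value, list):
        raise ValueError("value is not a list")
    return value


def to_list(value: Any) -> list:
    # Accept any iterable (tuple, set, generator, ...) and normalize to list.
    # Strings are iterable too, but are rejected here by assumption, since a
    # kernel ID list is not expected to arrive as a single string.
    if isinstance(value, (str, bytes)) or not isinstance(value, Iterable):
        raise ValueError("value must be Iterable")
    return list(value)


# An RPC layer that deserializes sequences into tuples, as described above.
rpc_payload = ("kernel-1", "kernel-2")

try:
    strict_list(rpc_payload)
    strict_rejected = False
except ValueError:
    strict_rejected = True

converted = to_list(rpc_payload)
```

With the tuple payload, the list-only check fails while the iterable-accepting version returns a regular list.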

src/ai/backend/common/types.py (outdated, resolved)
@fregataa fregataa force-pushed the topic/07-22-feat_implement_resource_sync_interface branch from b590466 to ecb36ae Compare August 10, 2024 13:47
@fregataa fregataa force-pushed the topic/07-22-feat_implement_resource_sync_interface branch from ecb36ae to 82b4f15 Compare August 25, 2024 06:23
@fregataa fregataa requested a review from achimnol August 25, 2024 06:27
@fregataa fregataa marked this pull request as draft October 21, 2024 02:21
@fregataa fregataa modified the milestones: 24.09, 24.12 Oct 21, 2024
Labels
comp:agent Related to Agent component comp:common Related to Common component comp:manager Related to Manager component size:L 100~500 LoC skip:changelog Make the action workflow to skip towncrier check
2 participants