Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Record logs into step artifacts #1339
Record logs into step artifacts #1339
Changes from 9 commits
79e5763
c3cbf4b
f3debf5
c1e7ff6
b1f14c6
b2cb9c4
eb6fc3d
57cd772
419e12b
ddbc340
f313326
9b35651
2d90fed
be0e817
5ee8c3b
f8a6b58
8bf863c
5d32d53
3c37a4a
1aa1146
f3bb2fe
6606fba
aabc0fc
b83c648
40e9dd3
5c8488d
127b8a6
4742fc4
974b37e
d48f792
f57c7b6
c8274c3
3ff49cb
c3ce7d3
72b3f15
aac781f
ff86b52
a86d81d
0f0633f
727dd7e
2312281
d66e67b
2cf5f69
63829fa
9eda7ca
631f898
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this PR is a good start.
There are a couple of problems i'm seeing right now (not because of the code here) that we need keep iterating:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wintonzheng are we certain that each log must have a step_id?
From what I see in the logs when I ran tasks there are several logs in the beginning that don't belong to any step:
Here first 7 logs would be without a step because the first step is not created yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wintonzheng do I undestand correctly that we want to create task and step ids in advance?
This way the logs that are currently not associated with a step or task because step/task not created yet would be associated with the first task and first step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, step_id needs to be manually defined in
LOG.info("xxx", step_id="stp_xxxx")
nowadays. those logs you shared here don't belong to any step so it's okay many logs don't havestep_id
.yep, exactly what i mean. we need to backfill step_id and task_id in some logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way we can also capture task and workflow level logs? Even if we don't render them in the UI today
We historically associate logs like that w/ the first step, although @wintonzheng is currently overhauling it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we just added new columns to the artifacts db: https://github.com/Skyvern-AI/skyvern/blob/main/skyvern/forge/sdk/db/models.py#L169-L174
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/Skyvern-AI/skyvern/pull/1345/files
here's the example of supporting artifact upload for observer thought.
going forward we can add support for task level and workflow run level artifacts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step log..
Task log..
Workflow log...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think adding STEP_LOG, TASK_LOG, WORKFLOW_LOG helps.
It's better to use step_id, task_id and workflow_run_id columns to filter artifacts. Also, step log and task log are not mutually exclusive. TaskLogs should be the super set of StepLogs. WorkflowRunLogs should be the super set of TaskLogs. We don't have to create multiple sets of log artifacts that have duplicated data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I agree with this -- why not have multiple artifact types that may or may not include duplicate data? It makes the implementation + writing of data + reading of data dead simple.
Why is it better to use step_id / task_id / workflow_run_id columns to filter artifacts w/ the same name + type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly in this case we'll store workflow and task level logs in the step artifact folders
Current implementation of
bulid
uri always assumes step:We could define a build uri method for logs that would take entity type and id
So that the uri would look like this:
This way we would be able to fetch logs on the frontend using a similar query as for the other artifacts
We would be able to use artifact type for all the levels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@suchintan no concern of data duplication. you're wright, having more artifact_types makes the implementation easier.