Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add log signal release note and update docs #10126

Merged
merged 7 commits into from
Oct 25, 2024

Conversation

jgongd
Copy link
Contributor

@jgongd jgongd commented Oct 25, 2024

Ticket

MD-531

Description

  • Log signal release note
  • Update log signal doc

Test Plan

Checklist

  • Changes have been manually QA'd
  • New features have been approved by the corresponding PM
  • User-facing API changes have the "User-facing API Change" label
  • Release notes have been added as a separate file under docs/release-notes/
    See Release Note for details.
  • Licenses have been included for new code which was copied and/or modified from any external code

@cla-bot cla-bot bot added the cla-signed label Oct 25, 2024
@determined-ci determined-ci requested a review from a team October 25, 2024 01:47
@determined-ci determined-ci added the documentation Improvements or additions to documentation label Oct 25, 2024
@jgongd jgongd requested review from gt2345 and removed request for a team October 25, 2024 01:47
Copy link

netlify bot commented Oct 25, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit 72b9281
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/671bd6b9be35d800088b97e5

Comment on lines 9 to 10
It has a new format. ``name`` is required. ``pattern`` and ``action`` are optional. To make
things simpler, user no longer needs to specify the ``type`` field to set an action. For example:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has a new format. name is required, and action should be a plain string. See for more details.

Then skip the rest of the text.

The release note should be a brief note and interested users should go to the main docs.


- ``pattern``: Required. The regex pattern to match in the logs.
- ``pattern``: Optional. The regex pattern to match in the logs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth explaining why pattern would ever not be provided.

@jgongd jgongd force-pushed the jgong/doc/log-signal-release-note branch from 586588f to 79a371f Compare October 25, 2024 15:38
@determined-ci determined-ci requested a review from a team October 25, 2024 15:38
@@ -40,10 +40,9 @@ Example configuration:
.. code:: yaml

log_policies:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the log signal feature is a visual cue to the original fault tolerant project(exclude_node or cancel_retires when there are serious errors). The main goal is to make it easier for users to know whether a log policy has executed without reading the log.

log_policies is good at:

log_policies:
  - name: "Encountered Serious Error"
    pattern: ".*serious error.*"
    action: cancel_retires

We might not want to steer users toward doing something like below, since the backend and WebUI weren't really designed for this use case:

log_policies:
  - name: "stringA appears in the log"
    pattern: ".*stringA.*"
  - name: "stringB appears in the log"
    pattern: ".*stringB.*"

@@ -308,10 +308,11 @@ Optional. Defines actions and labels in response to trial logs matching specifie
language syntax). For more information about the syntax, you can visit this `RE2 reference page
<https://github.com/google/re2/wiki/Syntax>`__. Each log policy can have the following fields:

- ``name``: Optional. A name for the log policy. If provided, this name will be displayed as a
label in the UI when the log policy matches.
- ``name``: Required. A name for the log policy. This name will be displayed as a label in the UI
Copy link
Contributor

@tara-hpe tara-hpe Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- ``name``: Required. A name for the log policy. This name will be displayed as a label in the UI
- ``name``: Required. The name of the log policy, displayed as a label in the WebUI

- ``name``: Optional. A name for the log policy. If provided, this name will be displayed as a
label in the UI when the log policy matches.
- ``name``: Required. A name for the log policy. This name will be displayed as a label in the UI
when the log policy matches.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
when the log policy matches.
when a log match occurs.


- ``pattern``: Required. The regex pattern to match in the logs.
- ``pattern``: Optional. It is required to provide a regex pattern to match in the logs unless the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- ``pattern``: Optional. It is required to provide a regex pattern to match in the logs unless the
- ``pattern``: Optional. Defines a regex pattern to match log entries. If not specified, this policy is disabled.


- ``pattern``: Required. The regex pattern to match in the logs.
- ``pattern``: Optional. It is required to provide a regex pattern to match in the logs unless the
intention is to disable an existing policy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
intention is to disable an existing policy.

pattern: ".*CUDA out of memory.*"
action: cancel_retries

When a log policy matches, its name will be displayed as a label in the WebUI, allowing for easy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When a log policy matches, its name will be displayed as a label in the WebUI, allowing for easy
When a log policy matches, its name appears as a label in the WebUI, making it easy

action: cancel_retries

When a log policy matches, its name will be displayed as a label in the WebUI, allowing for easy
identification of specific issues during a run. These labels will appear in both the run table and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
identification of specific issues during a run. These labels will appear in both the run table and
to identify specific issues during a run. These labels are shown in both the run table and


**New Features**

- Experiments: ``log_policies`` now have a ``name`` field. When a log policy matches, its name will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Experiments: ``log_policies`` now have a ``name`` field. When a log policy matches, its name will
- Experiments: Add a ``name`` field to ``log_policies``. When a log policy matches, its name

**New Features**

- Experiments: ``log_policies`` now have a ``name`` field. When a log policy matches, its name will
be displayed as a label in the WebUI, allowing for easy identification of specific issues during
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
be displayed as a label in the WebUI, allowing for easy identification of specific issues during
shows as a label in the WebUI, making it easy to spot specific issues during


- Experiments: ``log_policies`` now have a ``name`` field. When a log policy matches, its name will
be displayed as a label in the WebUI, allowing for easy identification of specific issues during
a run. These labels will appear in both the run table and run detail views.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
a run. These labels will appear in both the run table and run detail views.
a run. Labels appear in both the run table and run detail views.

be displayed as a label in the WebUI, allowing for easy identification of specific issues during
a run. These labels will appear in both the run table and run detail views.

It has a new format. ``name`` is required, and ``action`` should be a plain string. For more
Copy link
Contributor

@tara-hpe tara-hpe Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It has a new format. ``name`` is required, and ``action`` should be a plain string. For more
In addition, there is a new format: ``name`` is required, and ``action`` is now a plain string. For more

Copy link
Contributor

@tara-hpe tara-hpe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested edits

@determined-ci determined-ci requested a review from a team October 25, 2024 17:00
@jgongd jgongd merged commit c7e0fb5 into main Oct 25, 2024
79 of 91 checks passed
@jgongd jgongd deleted the jgong/doc/log-signal-release-note branch October 25, 2024 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants