docs: Describe arbitrary metadata logging

initial draft with placeholder text
determined-ai · Oct 17, 2024 · 6e18c55
1 parent d69f7cc
Showing 5 changed files with 183 additions and 2 deletions.
diff --git a/docs/reference/experiment-config-reference.rst b/docs/reference/experiment-config-reference.rst
@@ -1004,12 +1004,12 @@ Optional. The maximum number of trials that can be worked on simultaneously. The
 
 Optional. If specified, the weights of *every* trial in the search will be initialized to the most
 recent checkpoint of the given trial ID. This will fail if the source trial's model architecture is
-incompatible with the model architecture of any of the trials in this experiment.
+inconsistent with the model architecture of any of the trials in this experiment.
 
 ``source_checkpoint_uuid``
 --------------------------
 
-Optional. Like ``source_trial_id`` but specifies an arbitrary checkpoint from which to initialize
+Optional. Like ``source_trial_id``, but specifies an arbitrary checkpoint from which to initialize
 weights. At most one of ``source_trial_id`` or ``source_checkpoint_uuid`` should be set.
 
 Grid
@@ -1654,3 +1654,13 @@ If :ref:`gres_supported <cluster-configuration-slurm>` is set to ``false``, spec
 to ensure that ``slots_per_node`` GPUs will be available on the nodes selected for the job using
 other configurations such as targeting a specific resource pool with only ``slots_per_node`` GPU
 nodes or specifying a PBS constraint in the experiment configuration.
+
+******************
+ Metadata Logging
+******************
+
+Determined supports logging arbitrary metadata for experiments. This feature allows users to store
+additional context and information about their runs. To log metadata, call ``report_metadata`` on
+the Core API training context from your training code.
+
+For example usage, see :ref:`metadata-logging-tutorial`.
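
The logging call described in the Metadata Logging section above could be wrapped as follows. This is a minimal sketch, not Determined's implementation: it assumes the training context exposes `report_metadata(dict)`, as used in the tutorial added later in this commit. The helper name `log_run_metadata`, the `train_context` parameter, and the `max_bytes` guard (an arbitrary value, not a Determined limit) are hypothetical.

```python
import json
from typing import Any, Dict


def log_run_metadata(
    train_context: Any,
    metadata: Dict[str, Any],
    max_bytes: int = 64 * 1024,  # arbitrary illustrative cap, not a Determined limit
) -> int:
    """Check that metadata is JSON-serializable and small, then report it.

    Assumes ``train_context`` has a ``report_metadata(dict)`` method, as used
    in the tutorial in this commit. Returns the serialized size in bytes.
    """
    # json.dumps raises TypeError for values that cannot be serialized.
    encoded = json.dumps(metadata).encode("utf-8")
    if len(encoded) > max_bytes:
        raise ValueError(f"metadata is {len(encoded)} bytes; limit is {max_bytes}")
    train_context.report_metadata(metadata)
    return len(encoded)
```

Inside a trial, this might be called as `log_run_metadata(context.train, {"dataset_version": "MNIST-v1.0"})`. Validating before reporting keeps non-serializable or oversized values from reaching the logging call.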
diff --git a/docs/tools/webui-if.rst b/docs/tools/webui-if.rst
@@ -241,3 +241,26 @@ Clear the message with the following command:
    .. code:: bash
 
       det master cluster-message clear
+
+********************************
+ Viewing and Filtering Metadata
+********************************
+
+You can use the WebUI to view and filter experiments based on logged metadata. For a tutorial on how
+to log metadata, visit :ref:`metadata-logging-tutorial`.
+
+-  In the Overview tab of the experiment, you can filter and sort runs based on metadata values
+   using the filter menu.
+-  In the Runs (Table) view, metadata columns are displayed alongside other experiment information.
+-  On the Run details page, you'll find the "Metadata" section under the "Overview" tab, displaying
+   all logged metadata for that run.
+-  To download the metadata in JSON format, click the "Download" button.
+
+To filter runs based on metadata:
+
+#. In the Runs view, click on the filter icon.
+#. Select a metadata field from the dropdown menu.
+#. Choose a condition (is, is not, or contains) and enter a value.
+#. Click "Apply" to filter the runs based on the metadata.
+
+Note: Array-type metadata can be viewed but cannot be used for sorting or filtering.
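
The three filter conditions listed above can be mimicked offline on metadata downloaded as JSON. The sketch below is an assumption about their meaning (exact string match, its negation, and substring match). The WebUI's actual matching rules, such as case sensitivity, are not stated here. `matches` and `filter_runs` are hypothetical helpers. Array-type values are skipped to mirror the note above, and nested objects are skipped as a simplifying assumption.

```python
from typing import Any, Dict, List


def matches(metadata: Dict[str, Any], field: str, condition: str, value: str) -> bool:
    """Return True if a top-level metadata field satisfies the condition."""
    if condition not in ("is", "is not", "contains"):
        raise ValueError(f"unknown condition: {condition!r}")
    actual = metadata.get(field)
    # Missing fields never match; arrays and nested objects are skipped here.
    if actual is None or isinstance(actual, (list, dict)):
        return False
    text = str(actual)
    if condition == "is":
        return text == value
    if condition == "is not":
        return text != value
    return value in text  # "contains"


def filter_runs(
    runs: List[Dict[str, Any]], field: str, condition: str, value: str
) -> List[Dict[str, Any]]:
    """Keep the runs whose metadata matches the filter."""
    return [run for run in runs if matches(run, field, condition, value)]
```

For example, `filter_runs(runs, "gpu", "contains", "A100")` keeps only the runs whose top-level `gpu` value includes that substring.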
diff --git a/docs/tutorials/_index.rst b/docs/tutorials/_index.rst
@@ -46,6 +46,7 @@ Examples let you build off of an existing model that already runs on Determined.
    :hidden:
 
    Quickstart for Model Developers <quickstart-mdldev>
+   Arbitrary Metadata Logging <metadata-logging>
    Porting Your PyTorch Model to Determined <pytorch-mnist-tutorial>
    Get Started with Detached Mode <detached-mode/_index>
    Viewing Epoch-Based Metrics in the WebUI <viewing-epoch-based-metrics>

diff --git a/docs/tutorials/metadata-logging.rst b/docs/tutorials/metadata-logging.rst
@@ -0,0 +1,137 @@
+.. _metadata-logging-tutorial:
+
+############################
+ Arbitrary Metadata Logging
+############################
+
+This tutorial demonstrates how to use Arbitrary Metadata Logging in Determined AI to log custom metadata for your experiments.
+
+**Why Use Arbitrary Metadata Logging?**
+
+Arbitrary Metadata Logging allows you to:
+
+- Capture experiment-specific information beyond standard metrics
+- Compare and analyze custom data across experiments
+- Filter and sort experiments based on custom metadata
+
+******************
+ Logging Metadata
+******************
+
+You can log metadata using the Determined Core API. Here's how to do it in your training code:
+
+1. Import the necessary module:
+
+   .. code:: python
+
+      from determined.core import Context
+
+2. In your trial class, add a method to log metadata:
+
+   .. code:: python
+
+      def log_metadata(self, context: Context):
+          context.train.report_metadata({
+              "dataset_version": "MNIST-v1.0",
+              "preprocessing": "normalization",
+              "hardware": {
+                  "gpu": "NVIDIA A100",
+                  "cpu": "Intel Xeon"
+              }
+          })
+
+3. Call this method in your training loop:
+
+   .. code:: python
+
+      def train_batch(self, batch: TorchData, epoch_idx: int, batch_idx: int):
+          # Existing training code...
+          
+          if batch_idx == 0:
+              self.log_metadata(self.context)
+
+          # Rest of the training code...
+
+This example logs metadata only when ``batch_idx`` is ``0``, that is, on the first batch. Adjust the condition to log more or less often.
+
+*******************************
+ Viewing Metadata in the WebUI
+*******************************
+
+To view logged metadata:
+
+1. Open the WebUI and navigate to your experiment.
+2. Click on the trial you want to inspect.
+3. On the trial details page, find the "Metadata" section under the "Overview" tab.
+
+***********************************
+ Filtering and Sorting by Metadata
+***********************************
+
+The :ref:`WebUI <web-ui-if>` allows you to filter and sort experiments based on logged metadata:
+
+1. Navigate to the Experiments List page in the WebUI.
+2. Click on the filter icon.
+3. Select a metadata field from the dropdown menu.
+4. Choose a condition (is, is not, or contains) and enter a value.
+5. Click "Apply" to filter the experiments based on the metadata.
+
+For more detailed instructions on filtering and sorting, refer to the :ref:`WebUI guide <web-ui-if>`.
+
+Performance Considerations
+==========================
+
+When using Arbitrary Metadata Logging, consider the following:
+
+- Metadata is stored efficiently for fast retrieval and filtering.
+- Avoid logging very large metadata objects, as this may impact performance.
+- Use consistent naming conventions for keys to make filtering and sorting easier.
+- For deeply nested JSON structures, filtering and sorting are supported at the top level.
+
+Example Use Case
+================
+
+Let's say you're running experiments to benchmark different hardware setups. For each run, you might log:
+
+.. code:: python
+
+   def log_hardware_metadata(self, context: Context):
+       context.train.report_metadata({
+           "hardware": {
+               "gpu": "NVIDIA A100",
+               "cpu": "Intel Xeon",
+               "ram": "64GB"
+           },
+           "software": {
+               "cuda_version": "11.2",
+               "python_version": "3.8.10"
+           },
+           "runtime_seconds": 3600
+       })
+
+You can then use these logged metadata fields to:
+
+1. Filter for experiments that ran on a specific GPU model.
+2. Compare runtimes across different hardware configurations.
+3. Analyze the impact of software versions on performance.
+
+Summary
+=======
+
+Arbitrary Metadata Logging enhances your experiment tracking capabilities by allowing you to:
+
+1. Log custom metadata specific to your experiments.
+2. View logged metadata in the WebUI for each trial.
+3. Filter and sort experiments based on custom metadata.
+4. Compare and analyze experiments using custom metadata fields.
+
+This lets you capture and analyze experiment-specific information beyond standard metrics, which makes comparisons across experiments more informative.
+
+Next Steps
+==========
+
+- Experiment with logging different types of metadata in your trials.
+- Use the filtering and sorting capabilities in the WebUI to analyze your experiments.
+- Integrate metadata logging into your existing Determined AI workflows to enhance your experiment tracking.
+
+For more tutorials and guides, visit the :ref:`tutorials-index`.
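
The tutorial's examples hard-code values such as the Python version. Below is a minimal sketch of assembling a similar dictionary at run time using only the standard library. `build_run_metadata` is a hypothetical helper and the chosen keys are illustrative. GPU details are omitted because detecting them needs a framework-specific call. The result would be passed to `context.train.report_metadata(...)` as in the tutorial.

```python
import json
import platform
import sys
import time
from typing import Any, Dict


def build_run_metadata(dataset_version: str, started_at: float) -> Dict[str, Any]:
    """Assemble a JSON-serializable metadata dict (hypothetical helper).

    ``started_at`` is a ``time.monotonic()`` reading taken when the run began.
    """
    metadata: Dict[str, Any] = {
        "dataset_version": dataset_version,
        "software": {
            "python_version": platform.python_version(),
            "platform": sys.platform,
        },
        "hardware": {"machine": platform.machine()},
        "runtime_seconds": round(time.monotonic() - started_at, 3),
    }
    # Fail early if any value cannot be serialized to JSON.
    json.dumps(metadata)
    return metadata
```

Collecting values at run time keeps the logged metadata in step with the environment the run actually used, which matters when filtering by software version later.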
diff --git a/docs/tutorials/quickstart-mdldev.rst b/docs/tutorials/quickstart-mdldev.rst
@@ -352,6 +352,16 @@ This example uses a fixed batch size and searches on dropout size, filters, and
    one trial performing at about 98 percent validation accuracy. The hyperparameter search halts
    poorly performing trials.
 
+*************************
+ Logging Custom Metadata
+*************************
+
+Determined also supports logging custom metadata during a trial run. This feature allows you to
+capture additional context and information about your experiments beyond standard metrics.
+
+To learn more about how to use metadata logging in your experiments, see the
+:ref:`metadata-logging-tutorial` tutorial.
+
 ************
  Learn More
 ************