diff --git a/docs/source/brain.rst b/docs/source/brain.rst index a251f4f25f..d8a72fa636 100644 --- a/docs/source/brain.rst +++ b/docs/source/brain.rst @@ -38,6 +38,26 @@ workflow: you can easily query and sort your datasets to find similar examples, both programmatically and via point-and-click in the App. +* :ref:`Leaky splits `: + Often when sourcing data en masse, duplicates and near duplicates can slip + through the cracks. The FiftyOne Brain offers a *leaky splits analysis* that + can be used to find potential leaks between dataset splits. Such leaks can + be misleading when evaluating a model, giving an overly optimistic measure + for the quality of training. + +* :ref:`Near duplicates `: + When curating massive datasets, you may inadvertently add near duplicate data + to your datasets, which can bias or otherwise confuse your models. The + FiftyOne Brain offers a *near duplicate detection* algorithm that + automatically surfaces such data quality issues and prompts you to take + action to resolve them. + +* :ref:`Exact duplicates `: + Despite your best efforts, you may accidentally add duplicate data to a + dataset. The FiftyOne Brain provides an *exact duplicate detection* method + that scans your data and alerts you if a dataset contains duplicate samples, + either under the same or different filenames. + * :ref:`Uniqueness `: During the training loop for a model, the best results will be seen when training on unique data. The FiftyOne Brain provides a @@ -74,17 +94,10 @@ workflow: examples to train on in your data and for visualizing common modes of the data. -* :ref:`Leaky Splits `: - Often when sourcing data en masse, duplicates and near duplicates can slip - through the cracks. The FiftyOne Brain offers a *leaky-splits analysis* that - can be used to find potential leaks between dataset splits. These splits can - be misleading when evaluating a model, giving an overly optimistic measure - for the quality of training. - .. note:: Check out the :ref:`tutorials page ` for detailed examples - demonstrating the use of each Brain capability. + demonstrating the use of many Brain capabilities. .. _brain-embeddings-visualization: @@ -765,22 +778,6 @@ along with the `brain_key` of a compatible similarity index: :class:`PromptMixin ` interface can support text similarity queries! -.. _brain-similarity-duplicates: - -Duplicate detection -------------------- - -For some :ref:`similarity backends ` --- including -the default sklearn backend --- the |SimilarityIndex| object returned by -:meth:`compute_similarity() ` also provides -powerful -:meth:`find_unique() ` -and -:meth:`find_duplicates() ` -methods that you can use to find both maximally unique and near-duplicate -subsets of your datasets or their object patches. See -:ref:`this section ` for example uses. - .. _brain-similarity-api: Similarity API @@ -1268,149 +1265,260 @@ more detail or underrepresented classes that need more training examples. Here are a few of the many possible applications: +- Pruning :ref:`near-duplicate images ` from your + training dataset - Identifying failure patterns of a model - Finding examples of target scenarios in your data lake - Mining hard examples for your evaluation pipeline - Recommending samples from your data lake for classes that need additional training data -- Pruning near-duplicate images from your training dataset - -.. 
_brain-similarity-cifar10: - -CIFAR-10 example ----------------- -The following example demonstrates two common workflows that you can perform -using a similarity index generated via -:meth:`compute_similarity() ` on the -:ref:`CIFAR-10 dataset `: +.. _brain-leaky-splits: -- Selecting a set of maximally unique images from the dataset -- Identifying near-duplicate images in the dataset +Leaky splits +____________ -.. warning:: +Despite our best efforts, duplicates and other forms of non-IID samples +show up in our data. When these samples end up in different splits, this +can have consequences when evaluating a model. It can often be easy to +overestimate model capability due to this issue. The FiftyOne Brain offers a +way to identify such cases in dataset splits. - This workflow is only supported by the default `sklearn` backend. +The leaks of a dataset can be computed directly without the need for the +predictions of a pre-trained model via the +:meth:`compute_leaky_splits() ` method: .. code-block:: python :linenos: import fiftyone as fo - import fiftyone.zoo as foz + import fiftyone.brain as fob - dataset = foz.load_zoo_dataset("cifar10", split="test") - print(dataset) + dataset = fo.load_dataset(...) -To proceed, we first need some suitable image embeddings for the dataset. -Although the :meth:`compute_similarity() ` -and :meth:`compute_visualization() ` -methods are equipped with a default general-purpose model to generate -embeddings if none are provided, you'll typically find higher-quality insights -when a domain-specific model is used to generate embeddings. + # Splits defined via tags + split_tags = ["train", "test"] + index = fob.compute_leaky_splits(dataset, splits=split_tags) + leaks = index.leaks_view() -In this case, we'll use a classifier that has been fine-tuned on CIFAR-10 to -compute some embeddings and then generate image similarity/visualization -indexes for them: + # Splits defined via field + split_field = "split" # holds split values e.g. 'train' or 'test' + index = fob.compute_leaky_splits(dataset, splits=split_field) + leaks = index.leaks_view() + + # Splits defined via views + split_views = {"train": train_view, "test": test_view} + index = fob.compute_leaky_splits(dataset, splits=split_views) + leaks = index.leaks_view() + +Notice how the splits of the dataset can be defined in three ways: through +sample tags, through a string field that assigns each split a unique value in +the field, or by directly providing views that define the splits. + +**Input**: A |Dataset| or |DatasetView|, and a definition of splits through one +of tags, a field, or views. + +**Output**: An index that will allow you to look through your leaks with +:meth:`leaks_view() ` +and also provides some useful actions once they are discovered such as +automatically cleaning the dataset with +:meth:`no_leaks_view() ` +or tagging the leaks for the future action with +:meth:`tag_leaks() `. + +**What to expect**: Leaky splits works by embedding samples with a powerful +model and finding very close samples in different splits in this space. Large, +powerful models that were *not* trained on a dataset can provide insight into +visual and semantic similarity between images, without creating further leaks +in the process. + +**Similarity index**: Under the hood, leaky splits leverages the brain's +:class:`SimilarityIndex ` to detect +leaks. Any :ref:`similarity backend ` that +implements the +:class:`DuplicatesMixin ` can be +used to compute leaky splits. 
You can either pass an existing similarity index +by passing its brain key to the argument `similarity_index`, or have the +method create one on the fly for you. + +**Embeddings**: You can customize the model used to compute embeddings via the +`model` argument of +:meth:`compute_leaky_splits() `. You can +also precompute embeddings and tell leaky splits to use them by passing them +via the `embeddings` argument. + +**Thresholds**: Leaky splits uses a threshold to decide what samples are +too close and thus mark them as potential leaks. This threshold can be +customized either by passing a value to the `threshold` argument of +:meth:`compute_leaky_splits() `. The best +value for your use case may vary depending on your dataset, as well as the +embeddings used. A threshold that's too big may have a lot of false positives, +while a threshold that's too small may have a lot of false negatives. + +The example code below runs leaky splits analysis on the +`COCO dataset `_. Try it for yourself and see +what you find! .. code-block:: python :linenos: + import fiftyone as fo import fiftyone.brain as fob - import fiftyone.brain.internal.models as fbm - - # Compute embeddings via a pre-trained CIFAR-10 classifier - model = fbm.load_model("simple-resnet-cifar10") - embeddings = dataset.compute_embeddings(model, batch_size=16) + import fiftyone.zoo as foz + import fiftyone.utils.random as four - # Generate similarity index - results = fob.compute_similarity( - dataset, embeddings=embeddings, brain_key="img_sim" - ) + # Load some COCO data + dataset = foz.load_zoo_dataset("coco-2017", split="test") - # Generate a 2D visualization - viz_results = fob.compute_visualization( - dataset, embeddings=embeddings, brain_key="img_viz" - ) + # Set up splits via tags + dataset.untag_samples(dataset.distinct("tags")) + four.random_split(dataset, {"train": 0.7, "test": 0.3}) -Finding maximally unique images -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + # Find leaks + index = fob.compute_leaky_splits(dataset, splits=["train", "test"]) + leaks = index.leaks_view() -With a similarity index generated, we can use the -:meth:`find_unique() ` -method of the index to identify a set of images of any desired size that are -maximally unique with respect to each other: +The +:meth:`leaks_view() ` +method returns a view that contains only the leaks in the input splits. Once +you have these leaks, it is wise to look through them. You may gain some +insight into the source of the leaks: .. code-block:: python :linenos: - # Use the similarity index to identify 500 maximally unique images - results.find_unique(500) - print(results.unique_ids[:5]) + session = fo.launch_app(leaks) -We can also conveniently visualize the results of this operation via the -:meth:`visualize_unique() ` -method of the results object, which generates a scatterplot with the unique -images colored separately: +Before evaluating your model on your test set, consider getting a version of it +with the leaks removed. This can be easily done via +:meth:`no_leaks_view() `: .. code-block:: python :linenos: - # Visualize the unique images in embeddings space - plot = results.visualize_unique(visualization=viz_results) - plot.show(height=800, yaxis_scaleanchor="x") + # The original test split + test_set = index.split_views["test"] -.. 
image:: /images/brain/brain-cifar10-unique-viz.png
- :alt: cifar10-unique-viz
+ # The test set with leaks removed
+ test_set_no_leaks = index.no_leaks_view(test_set)
+
+ session.view = test_set_no_leaks
+
+Performance on the clean test set can be closer to the performance of the
+model in the wild. If you found some leaks in your dataset, consider comparing
+performance on the base test set against the clean test set.
+
+.. image:: /images/brain/brain-leaky-splits.png
+ :alt: leaky-splits
 :align: center
-And of course we can load a view containing the unique images in the App to
-explore the results in detail:
+.. _brain-near-duplicates:
+
+Near duplicates
+_______________
+
+When curating massive datasets, you may inadvertently add near duplicate data
+to your datasets, which can bias or otherwise confuse your models.
+
+The :meth:`compute_near_duplicates() `
+method leverages embeddings to automatically surface near-duplicate samples in
+your dataset:
 .. code-block:: python
 :linenos:
- # Visualize the unique images in the App
- unique_view = dataset.select(results.unique_ids)
- session = fo.launch_app(view=unique_view)
+ import fiftyone as fo
+ import fiftyone.brain as fob
-.. image:: /images/brain/brain-cifar10-unique-view.png
- :alt: cifar10-unique-view
- :align: center
+ dataset = fo.load_dataset(...)
-Finding near-duplicate images
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ index = fob.compute_near_duplicates(dataset)
+ print(index.duplicate_ids)
-We can also use our similarity index to detect *near-duplicate* images in the
-dataset.
+ dups_view = index.duplicates_view()
+ session = fo.launch_app(dups_view)
+
+**Input**: An unlabeled (or labeled) dataset. There are
+:ref:`recipes ` for building datasets from a wide variety of image
+formats, ranging from a simple directory of images to complicated dataset
+structures like `COCO `_.
-For example, let's use the
-:meth:`find_duplicates() `
-method to identify the least similar images in our dataset:
+**Output**: A |SimilarityIndex| object that provides powerful methods such as
+:meth:`duplicate_ids `,
+:meth:`neighbors_map `
+and
+:meth:`duplicates_view() `
+to analyze potential near duplicates as demonstrated below.
+
+**What to expect**: Near duplicates analysis leverages embeddings to identify
+samples that are too close to their nearest neighbors. You can provide
+pre-computed embeddings, specify a :ref:`zoo model ` of your choice
+to use to compute embeddings, or provide nothing and rely on the method's
+default model to generate embeddings.
+
+**Thresholds**: When using custom embeddings/models, you may need to adjust the
+distance threshold used to detect potential duplicates. You can do this by
+passing a value to the `threshold` argument of
+:meth:`compute_near_duplicates() `. The
+best value for your use case may vary depending on your dataset, as well as the
+embeddings used. A threshold that's too big may have a lot of false positives,
+while a threshold that's too small may have a lot of false negatives.
+
+The following example demonstrates how to use
+:meth:`compute_near_duplicates() ` to
+detect near duplicate images on the
+:ref:`CIFAR-10 dataset `:
 .. code-block:: python
 :linenos:
- # Use the similarity index to identify the 1% of images that are least
- # similar w.r.t. the other images
- results.find_duplicates(fraction=0.01)
+ import fiftyone as fo
+ import fiftyone.zoo as foz
- print(results.neighbors_map)
+ dataset = foz.load_zoo_dataset("cifar10", split="test")
-.. 
note::
+To proceed, we first need some suitable image embeddings for the dataset.
+Although the :meth:`compute_near_duplicates() `
+method is equipped with a default general-purpose model to generate embeddings
+if none are provided, you'll typically find higher-quality insights when a
+domain-specific model is used to generate embeddings.
+
+In this case, we'll use a classifier that has been fine-tuned on CIFAR-10 to
+pre-compute embeddings and then feed them to
+:meth:`compute_near_duplicates() `:
+
+.. code-block:: python
+ :linenos:
+
+ import fiftyone.brain as fob
+ import fiftyone.brain.internal.models as fbm
+
+ # Compute embeddings via a pre-trained CIFAR-10 classifier
+ model = fbm.load_model("simple-resnet-cifar10")
+ embeddings = dataset.compute_embeddings(model, batch_size=16)
+
+ # Scan for near-duplicates
+ index = fob.compute_near_duplicates(
+ dataset,
+ embeddings=embeddings,
+ thresh=0.02,
+ )
- You can also provide a specific embeddings distance threshold to
- :meth:`find_duplicates() `,
- in which case the non-duplicate set will be the (approximately) largest set
- such that all pairwise distances between non-duplicate images are
- *greater* than this threshold.
+Finding near-duplicate samples
+------------------------------
 The :meth:`neighbors_map `
-property of the results object provides a data structure that summarizes the
-findings. The keys of the dictionary are the sample IDs of each nearest
-non-duplicate image, and the values are lists of `(id, distance)` tuples
-listing the sample IDs of the duplicate images for each in-sample image
-together with the embedding distance between the two images:
+property of the index provides a data structure that summarizes the findings.
+The keys of the dictionary are the sample IDs of each non-duplicate sample, and
+the values are lists of `(id, distance)` tuples listing the sample IDs of the
+duplicate samples for each reference sample together with the embedding
+distance between the two samples:
+
+.. code-block:: python
+ :linenos:
+
+ print(index.neighbors_map)
 .. code-block:: text
@@ -1433,26 +1541,118 @@ together with the embedding distance between the two images:
 We can conveniently visualize this information in the App via the
 :meth:`duplicates_view() `
-method of the results object, which constructs a view with the duplicate images
-arranged directly after their corresponding nearest in-sample image, with
-additional sample fields recording the type and nearest in-sample ID/distance
-for each image:
+method of the index, which constructs a view with the duplicate samples
+arranged directly after their corresponding reference sample, with optional
+additional fields recording the type and nearest reference sample ID/distance:
 .. code-block:: python
 :linenos:
- duplicates_view = results.duplicates_view(
+ duplicates_view = index.duplicates_view(
 type_field="dup_type",
 id_field="dup_id",
 dist_field="dup_dist",
 )
- session.view = duplicates_view
+ session = fo.launch_app(duplicates_view)
 .. image:: /images/brain/brain-cifar10-duplicate-view.png
 :alt: cifar10-duplicate-view
 :align: center
+.. note::
+
+ You can also use the
+ :meth:`find_duplicates() `
+ method of the index to rerun the duplicate detection with a different
+ `threshold` without calling
+ :meth:`compute_near_duplicates() `
+ again. 
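For example, a minimal sketch that reuses the `index` and `session` from above
to rerun the scan at a stricter distance threshold might look like this:

.. code-block:: python
    :linenos:

    # Rerun duplicate detection with a stricter distance threshold
    index.find_duplicates(thresh=0.01)

    # Refresh the view of potential duplicates in the App
    session.view = index.duplicates_view()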
+
+Finding maximally unique samples
+--------------------------------
+
+You can also use the
+:meth:`find_unique() `
+method of the index to identify a set of samples of any desired size that are
+maximally unique with respect to each other:
+
+.. code-block:: python
+ :linenos:
+
+ # Use the similarity index to identify 500 maximally unique samples
+ index.find_unique(500)
+ print(index.unique_ids[:5])
+
+We can also conveniently visualize the results of this operation via the
+:meth:`visualize_unique() `
+method of the index, which generates a scatterplot with the unique samples
+colored separately:
+
+.. code-block:: python
+ :linenos:
+
+ # Generate a 2D visualization
+ viz_results = fob.compute_visualization(dataset, embeddings=embeddings)
+
+ # Visualize the unique samples in embeddings space
+ plot = index.visualize_unique(viz_results)
+ plot.show(height=800, yaxis_scaleanchor="x")
+
+.. image:: /images/brain/brain-cifar10-unique-viz.png
+ :alt: cifar10-unique-viz
+ :align: center
+
+And of course we can load a view containing the unique samples in the App to
+explore the results in detail:
+
+.. code-block:: python
+ :linenos:
+
+ # Visualize the unique images in the App
+ unique_view = dataset.select(index.unique_ids)
+ session = fo.launch_app(view=unique_view)
+
+.. image:: /images/brain/brain-cifar10-unique-view.png
+ :alt: cifar10-unique-view
+ :align: center
+
+.. _brain-exact-duplicates:
+
+Exact duplicates
+________________
+
+Despite your best efforts, you may accidentally add duplicate data to a
+dataset. Left unmitigated, such quality issues can bias your models and
+confound your analysis.
+
+The :meth:`compute_exact_duplicates() `
+method scans your dataset and determines if you have duplicate data either
+under the same or different filenames:
+
+.. code-block:: python
+ :linenos:
+
+ import fiftyone as fo
+ import fiftyone.brain as fob
+
+ dataset = fo.load_dataset(...)
+
+ duplicates_map = fob.compute_exact_duplicates(dataset)
+ print(duplicates_map)
+
+**Input**: An unlabeled (or labeled) dataset. There are
+:ref:`recipes ` for building datasets from a wide variety of image
+formats, ranging from a simple directory of images to complicated dataset
+structures like `COCO `_.
+
+**Output**: A dictionary mapping IDs of samples with exact duplicates to lists
+of IDs of the duplicates for the corresponding sample.
+
+**What to expect**: Exact duplicates analysis uses filehashes to identify
+duplicate data, regardless of whether they are stored under the same or
+different filepaths in your dataset.
+
 .. _brain-image-uniqueness:
 Image uniqueness
@@ -1762,149 +1962,17 @@ similar looking groups of samples. The representativeness is then computed
 based on each sample's proximity to the computed cluster centers, farther
 samples being less representative and closer samples being more representative.
-.. image:: /images/brain/brain-representativeness.png
- :alt: representativeness
- :align: center
-
-
-.. _brain-image-leaky-splits:
-
-Leaky Splits
-____________
-
-Despite our best efforts, duplicates and other forms of non-IID samples
-show up in our data. When these samples end up in different splits, this
-can have consequences when evaluating a model. It can often be easy to
-overestimate model capability due to this issue. The FiftyOne Brain offers a way
-to identify such cases in dataset splits.
-
-The leaks of a |Dataset| or |DatasetView| can be computed directly without the need
-for the predictions of a pre-trained model via the
-:meth:`compute_leaky_splits() `
-method:. 
The splits of a dataset can be defined in three ways. Through tags, by -tagging samples with their corresponding split. Through a field, by giving each -split a unique value in that field. And finally through views, by having views -corresponding to each split. - -.. code-block:: python - :linenos: - - import fiftyone as fo - import fiftyone.brain as fob - - dataset = fo.load_dataset(...) - - # splits via tags - split_tags = ['train', 'test'] - index, leaks = fob.compute_leaky_splits(dataset, split_tags=split_tags) - - # splits via field - split_field = ['split'] # holds split values e.g. 'train' or 'test' - index, leaks = fob.compute_leaky_splits(dataset, split_field=split_field) - - # splits via views - split_views = { - 'train' : some_view - 'test' : some_other_view - } - index, leaks = fob.compute_leaky_splits(dataset, split_views=split_views) - -Here is a sample snippet to run this on the `COCO `_ dataset. -Try it for yourself and see what you may find. - -.. code-block:: python - :linenos: - - import fiftyone as fo - import fiftyone.zoo as foz - import fiftyone.utils.random as four - from fiftyone.brain import compute_leaky_splits - - # load coco - dataset = foz.load_zoo_dataset("coco-2017", split="test") - - # set up splits via tags - dataset.untag_samples(dataset.distinct("tags")) - four.random_split(dataset, {"train": 0.7, "test": 0.3}) - - # compute leaks - index, leaks = compute_leaky_splits(dataset, split_tags=['train', 'test']) - -Once you have these leaks, it is wise to look through them. You may gain some insight -into the source of the leaks. - -.. code-block:: python - :linenos: - - session = fo.launch_app(leaks) - -Before evaluating your model on your test set, consider getting a version of it -with the leaks removed. This can be easily done with the built in method -:meth:`no_leaks_view() `. - -.. code-block:: python - :linenos: - - # if you already have it - test_set = some_view - - # can also be found with the variable `split_views` from the index - # make sure to put in the right string based on the field/tag/key in view dict - # passed when building the index - test_set = index.split_views['test'] - - test_set_no_leaks = index.no_leaks_view(test_set) # return a view with leaks removed - session.view = test_set_no_leaks - - # do evaluations on test_set_no_leaks rather than test_set - -Performance on the clean test set will can be closer to the performance of the -model in the wild. If you found some leaks in your dataset, consider comparing -performance on the base test set against the clean test set. - -**Input**: A |Dataset| or |DatasetView|, and a definition of splits through one -of tags, a field, or views. - -**Output**: An index that will allow you to look through your leaks and -provides some useful actions once they are discovered such as automatically -cleaning the dataset with -:meth:`no_leaks_view() ` -or tagging them for the future with -:meth:`tag_leaks() `. -Besides this, a view with all leaks is returned. Visualization of this view -can give you an insight into the source of the leaks in your dataset. - -**What to expect**: Leakiness find leaks by embedding samples with a powerful -model and finding very close samples in different splits in this space. Large, -powerful models that were *not* trained on a dataset can provide insight into -visual and semantic similarity between images, without creating further leaks -in the process. +.. note:: -**Similarity**: At its core, the leaky-splits module is a wrapper for the brain's -:class:`SimilarityIndex `. 
Any similarity -backend, (see :ref:`similarity backends `) that implements -the :class:`DuplicatesMixin ` can be used -to compute leaky splits. You can either pass an existing similarity index by passing -its brain key to the argument `similarity_brain_key`, or have the method create one on -the fly for you. If there is a specific configuration for `Similarity` you would like -to use, pass it in the argument `similarity_config_dict`. - -**Models and Embeddings**: If you opt for the method to create a `SimilarityIndex` -for you, you can still bring you own model by passing it in the `model` argument. -Alternatively, compute embeddings and pass the field that they reside on. We will -handle the rest. - -**Thresholds**: The leaky-splits module uses a threshold to decide what samples -are 'too close' and mark them as potential leaks. This threshold can be changed -either by passing a value to the `threshold` argument of the `compute_leaky_splits()` -method, or by using the -:meth:`set_threshold() ` -method. The best value for your use-case may vary depending on your dataset, as well -as the embeddings used. A threshold that's too big will have a lot of false positives, -a threshold that's too small will have a lot of false negatives. + Did you know? You can specify a region of interest within each image to use + to compute representativeness by providing the optional `roi_field` + argument to + :meth:`compute_representativeness() `, + which contains |Detections| or |Polylines| that define the ROI for each + sample. -.. image:: /images/brain/brain-leaky-splits.png - :alt: leaky-splits +.. image:: /images/brain/brain-representativeness.png + :alt: representativeness :align: center .. _brain-managing-runs: diff --git a/docs/source/dataset_zoo/index.rst b/docs/source/dataset_zoo/index.rst index fa8fb2ac52..3e97ae3eed 100644 --- a/docs/source/dataset_zoo/index.rst +++ b/docs/source/dataset_zoo/index.rst @@ -22,8 +22,8 @@ load into FiftyOne with a single command. :button_text: Explore the datasets in the zoo :button_link: datasets.html -Remotely-sourced datasets __SUB_NEW__ -------------------------------------- +Remotely-sourced datasets +------------------------- The Dataset Zoo also supports loading datasets whose download/preparation methods are provided via GitHub repositories or URLs. 
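For context, a remotely-sourced dataset is typically loaded by pointing
`load_zoo_dataset()` at the repository that hosts it; a minimal sketch (the
GitHub URL below is a placeholder, not a real repository) looks like:

.. code-block:: python
    :linenos:

    import fiftyone.zoo as foz

    # Load a dataset whose download/preparation logic lives in a GitHub repo
    dataset = foz.load_zoo_dataset("https://github.com/<user>/<repo>")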
diff --git a/docs/source/images/app/app-granular.gif b/docs/source/images/app/app-query-performance-disabled.gif similarity index 100% rename from docs/source/images/app/app-granular.gif rename to docs/source/images/app/app-query-performance-disabled.gif diff --git a/docs/source/images/app/app-query-performance-mode.gif b/docs/source/images/app/app-query-performance.gif similarity index 100% rename from docs/source/images/app/app-query-performance-mode.gif rename to docs/source/images/app/app-query-performance.gif diff --git a/docs/source/images/app/model-evaluation-class.gif b/docs/source/images/app/model-evaluation-class.gif new file mode 100644 index 0000000000..2e06af4a60 Binary files /dev/null and b/docs/source/images/app/model-evaluation-class.gif differ diff --git a/docs/source/images/app/model-evaluation-compare.gif b/docs/source/images/app/model-evaluation-compare.gif new file mode 100644 index 0000000000..60ae819db6 Binary files /dev/null and b/docs/source/images/app/model-evaluation-compare.gif differ diff --git a/docs/source/images/app/model-evaluation-confusion.gif b/docs/source/images/app/model-evaluation-confusion.gif new file mode 100644 index 0000000000..5f036c9fdb Binary files /dev/null and b/docs/source/images/app/model-evaluation-confusion.gif differ diff --git a/docs/source/images/app/model-evaluation-metric.gif b/docs/source/images/app/model-evaluation-metric.gif new file mode 100644 index 0000000000..8eeb857062 Binary files /dev/null and b/docs/source/images/app/model-evaluation-metric.gif differ diff --git a/docs/source/images/app/model-evaluation-notes.gif b/docs/source/images/app/model-evaluation-notes.gif new file mode 100644 index 0000000000..2361af0ff1 Binary files /dev/null and b/docs/source/images/app/model-evaluation-notes.gif differ diff --git a/docs/source/images/app/model-evaluation-open.gif b/docs/source/images/app/model-evaluation-open.gif new file mode 100644 index 0000000000..578b2a6f2e Binary files /dev/null and b/docs/source/images/app/model-evaluation-open.gif differ diff --git a/docs/source/images/app/model-evaluation-review.gif b/docs/source/images/app/model-evaluation-review.gif new file mode 100644 index 0000000000..cf71a3eaf1 Binary files /dev/null and b/docs/source/images/app/model-evaluation-review.gif differ diff --git a/docs/source/images/app/model-evaluation-summary.gif b/docs/source/images/app/model-evaluation-summary.gif new file mode 100644 index 0000000000..5340e3a7bd Binary files /dev/null and b/docs/source/images/app/model-evaluation-summary.gif differ diff --git a/docs/source/images/teams/data_quality_brightness_analysis.png b/docs/source/images/teams/data_quality_brightness_analysis.png new file mode 100644 index 0000000000..a074640e13 Binary files /dev/null and b/docs/source/images/teams/data_quality_brightness_analysis.png differ diff --git a/docs/source/images/teams/data_quality_brightness_mark_as_reviewed.png b/docs/source/images/teams/data_quality_brightness_mark_as_reviewed.png new file mode 100644 index 0000000000..c372b34e9a Binary files /dev/null and b/docs/source/images/teams/data_quality_brightness_mark_as_reviewed.png differ diff --git a/docs/source/images/teams/data_quality_brightness_scan.png b/docs/source/images/teams/data_quality_brightness_scan.png new file mode 100644 index 0000000000..58469ab398 Binary files /dev/null and b/docs/source/images/teams/data_quality_brightness_scan.png differ diff --git a/docs/source/images/teams/data_quality_brightness_scan_options.png 
b/docs/source/images/teams/data_quality_brightness_scan_options.png new file mode 100644 index 0000000000..066e5e0436 Binary files /dev/null and b/docs/source/images/teams/data_quality_brightness_scan_options.png differ diff --git a/docs/source/images/teams/data_quality_brightness_scheduled.png b/docs/source/images/teams/data_quality_brightness_scheduled.png new file mode 100644 index 0000000000..2d6ae863a7 Binary files /dev/null and b/docs/source/images/teams/data_quality_brightness_scheduled.png differ diff --git a/docs/source/images/teams/data_quality_brightness_slider.gif b/docs/source/images/teams/data_quality_brightness_slider.gif new file mode 100644 index 0000000000..bf8f97db82 Binary files /dev/null and b/docs/source/images/teams/data_quality_brightness_slider.gif differ diff --git a/docs/source/images/teams/data_quality_brightness_tag.png b/docs/source/images/teams/data_quality_brightness_tag.png new file mode 100644 index 0000000000..c04a8805a5 Binary files /dev/null and b/docs/source/images/teams/data_quality_brightness_tag.png differ diff --git a/docs/source/images/teams/data_quality_home.png b/docs/source/images/teams/data_quality_home.png new file mode 100644 index 0000000000..f4ebba3539 Binary files /dev/null and b/docs/source/images/teams/data_quality_home.png differ diff --git a/docs/source/images/teams/data_quality_new_samples_home.png b/docs/source/images/teams/data_quality_new_samples_home.png new file mode 100644 index 0000000000..9b92afc11b Binary files /dev/null and b/docs/source/images/teams/data_quality_new_samples_home.png differ diff --git a/docs/source/images/teams/data_quality_new_samples_modal.png b/docs/source/images/teams/data_quality_new_samples_modal.png new file mode 100644 index 0000000000..773e1874e6 Binary files /dev/null and b/docs/source/images/teams/data_quality_new_samples_modal.png differ diff --git a/docs/source/images/teams/qp_toggle.png b/docs/source/images/teams/qp_toggle.png deleted file mode 100644 index eb49d8bb86..0000000000 Binary files a/docs/source/images/teams/qp_toggle.png and /dev/null differ diff --git a/docs/source/index.rst b/docs/source/index.rst index 962a5c63ff..d1f39171df 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -487,8 +487,8 @@ us at support@voxel51.com. Recipes Cheat Sheets User Guide - Dataset Zoo __SUB_NEW__ - Model Zoo __SUB_NEW__ + Dataset Zoo + Model Zoo FiftyOne Brain Integrations Plugins diff --git a/docs/source/model_zoo/index.rst b/docs/source/model_zoo/index.rst index 893a2813c8..6510a75130 100644 --- a/docs/source/model_zoo/index.rst +++ b/docs/source/model_zoo/index.rst @@ -43,8 +43,8 @@ you can apply to your datasets with a few simple commands. :meth:`apply_model() ` and :meth:`compute_embeddings() `! -Remotely-sourced models __SUB_NEW__ ------------------------------------ +Remotely-sourced models +----------------------- The Model Zoo also supports downloading and applying models whose definitions are provided via GitHub repositories or URLs. diff --git a/docs/source/plugins/developing_plugins.rst b/docs/source/plugins/developing_plugins.rst index 8faf447c3c..286ff26265 100644 --- a/docs/source/plugins/developing_plugins.rst +++ b/docs/source/plugins/developing_plugins.rst @@ -2382,7 +2382,7 @@ loaded only when the `brain_key` property is modified. Panel data is never readable in Python; it is only implicitly used by the types you define when they are rendered clientside. -.. _panel-execution-store +.. 
_panel-execution-store:
 Execution store
 ---------------
diff --git a/docs/source/release-notes.rst b/docs/source/release-notes.rst index 00c0b72cd4..17aa4f7a62 100644 --- a/docs/source/release-notes.rst +++ b/docs/source/release-notes.rst
@@ -3,17 +3,147 @@ FiftyOne Release Notes
 .. default-role:: code
+FiftyOne Teams 2.2.0
+--------------------
+*Released December 4, 2024*
+
+Includes all updates from :ref:`FiftyOne 1.1.0 `, plus:
+
+- All Teams deployments now have builtin compute capacity for
+ executing :ref:`delegated operations ` in the
+ background while you work in the App
+- Introduced :ref:`Data Lens `, which allows you to explore and
+ import samples from external data sources into FiftyOne
+- Added a :ref:`Data Quality Panel ` that automatically scans
+ your data for quality issues and helps you take action to resolve them
+- Added a :ref:`Query Performance Panel ` that helps you
+ create the necessary indexes to optimize queries on large datasets
+- Added support for creating embeddings visualizations natively from the
+ :ref:`Embeddings panel `
+- Added support for evaluating models natively from the
+ :ref:`Model Evaluation panel `
+- Added support for :ref:`configuring an SMTP server ` for
+ sending user invitations via email when running in
+ :ref:`Internal Mode `
+
+.. _release-notes-v1.1.0:
+
+FiftyOne 1.1.0
+--------------
+*Released December 4, 2024*
+
+What's New
+
+- Added a :ref:`Model Evaluation panel ` for
+ visually and interactively evaluating models in the FiftyOne App
+- Introduced :ref:`Query Performance ` in the
+ App, which automatically nudges you to create the necessary indexes to
+ greatly optimize queries on large datasets
+- Added a :ref:`leaky splits method ` for automatically
+ detecting near-duplicate samples in different splits of your datasets
+- Added a :ref:`near duplicates method ` that scans
+ your datasets and detects potential duplicate samples
+
+App
+
+- Added zoom-to-crop and set-look-at for selected labels in the
+ :ref:`3D visualizer `
+ `#4931 `_
+- Gracefully handle deleted + recreated datasets of the same name
+ `#5183 `_
+- Fixed a bug that prevented video playback from working for videos with
+ unknown frame rate
+ `#5155 `_
+
+SDK
+
+- Added :meth:`min() ` and
+ :meth:`max() ` aggregations
+ `#5029 `_
+- Improved support for creating summary fields and indexes
+ `#5091 `_
+- Added support for creating compound indexes when using the builtin
+ :class:`create_index ` operator that
+ optimize sidebar queries for group datasets
+ `#5174 `_
+- The builtin
+ :class:`clear_sample_field `
+ and
+ :class:`clear_frame_field `
+ operators now support clearing fields of views, in addition to full datasets
+ `#5122 `_
+- Fixed a bug that prevented users with `pydantic` installed from loading the
+ :ref:`quickstart-3d dataset ` from the zoo
+ `#4994 `_
+
+Brain
+
+- Added support for passing existing
+ :ref:`similarity indexes ` to
+ :func:`compute_visualization() `,
+ :func:`compute_uniqueness() `, and
+ :func:`compute_representativeness() `
+ `#201 `_,
+ `#204 `_
+- Upgraded the :ref:`Pinecone integration ` to support
+ `pinecone-client>=3.2`
+ `#202 `_
+
+Plugins
+
+- Added an :ref:`Execution Store ` that provides a
+ key-value interface for persisting data beyond the lifetime of a panel
+ `#4827 `_,
+ `#5144 `_
+- Added
+ :meth:`ctx.spaces `
+ and
+ :meth:`set_spaces() `
+ to the operator execution context
+ `#4902 `_
+- Added
+ :meth:`open_sample() `
+ and
+ :meth:`close_sample() `
+ methods for programmatically controlling what 
sample(s) are displayed in the + App's sample modal + `#5168 `_ +- Added a `skip_prompt` option to + :meth:`ctx.prompt `, + allowing users to bypass prompts during operation execution + `#4992 `_ +- Introduced a new + :class:`StatusButtonView ` type + for rendering buttons with status indicators + `#5105 `_ +- Added support for giving + :class:`ImageView ` components click + targets + `#4996 `_ +- Added an :ref:`allow_legacy_orchestrators ` config flag + to enable running delegated operations + :ref:`locally ` + `#5176 `_ +- Fixed a bug when running delegated operations + :ref:`programmatically ` + `#5180 `_ +- Fixed a bug when running delegated operations with output schemas on + MongoDB `_ + + FiftyOne Teams 2.1.3 -------------------- -*Released November XX, 2024* +*Released November 8, 2024* -Includes all updates from :ref:`FiftyOne 1.0.2 ` +Includes all updates from :ref:`FiftyOne 1.0.2 `. -.. _release-notes-v1.0.3: +.. _release-notes-v1.0.2: FiftyOne 1.0.2 -------------- -*Released November XX, 2024* +*Released November 8, 2024* Zoo @@ -52,6 +182,7 @@ App - Fixed batch selection with ctrl + click in the grid `#5046 `_ + FiftyOne Teams 2.1.2 -------------------- *Released October 31, 2024* @@ -4295,7 +4426,7 @@ Annotation Docs - Added a :doc:`CVAT annotation tutorial ` -- Added a :ref:`new example ` to the brain user guide +- Added a :ref:`new example ` to the brain user guide that demonstrates unique and near-duplicate image workflows - Added an object embeddings example to the :ref:`embeddings visualization section ` of diff --git a/docs/source/teams/data_quality.rst b/docs/source/teams/data_quality.rst new file mode 100644 index 0000000000..3456c323a1 --- /dev/null +++ b/docs/source/teams/data_quality.rst @@ -0,0 +1,204 @@ +.. _data-quality: + +Data Quality +============ + +.. default-role:: code + +**Available in FiftyOne Teams v2.2+** + +The Data Quality panel is a builtin feature of the +:ref:`FiftyOne Teams App ` that automatically scans your dataset +for common quality issues and helps you explore and take action to resolve +them. + +.. _data-quality-home: + +Data Quality panel +__________________ + +You can open the Data Quality panel by clicking the "+" icon next to the +Samples tab. + +The panel's home page shows a list of the available issue types and their +current analysis/review status: + +- **Brightness**: scans for images that are unusually bright or dim +- **Blurriness**: scans for images that are abnormally blurry or sharp +- **Aspect Ratio**: scans for images that have extreme aspect ratios +- **Entropy**: scans for images that have unusually small or large entropy +- **Near Duplicates**: leverages embeddings to scan for + :ref:`near-duplicate samples ` in your dataset +- **Exact Duplicates**: uses filehashes to scan your dataset for duplicate + data with either the same or different filenames + +Click on the right arrow of an issue type's card to open its expanded view. + +.. image:: /images/teams/data_quality_home.png + :alt: data-quality-home + :align: center + +.. _data-quality-scan: + +Scanning for issues +___________________ + +If you have not yet scanned a dataset for a given issue type, you'll see a +landing page like this: + +.. image:: /images/teams/data_quality_brightness_scan.png + :alt: data-quality-brightness-scan + :align: center + +Clicking the "Scan Dataset" button presents two choices for execution: + +.. image:: /images/teams/data_quality_brightness_scan_options.png + :alt: data-quality-brightness-scan-options + :align: center + +.. 
note::
+
+ The "Execute" option is **only for testing**. In this mode, computation is
+ performed synchronously and will timeout if it does not complete within a
+ few minutes.
+
+ Choose "Schedule" for all production data, which schedules the scan for
+ :ref:`delegated execution ` on your compute
+ cluster.
+
+While a scan is in progress, you'll see a status page like this:
+
+.. image:: /images/teams/data_quality_brightness_scheduled.png
+ :alt: data-quality-brightness-scheduled
+ :align: center
+
+Click the link in the notification to navigate to the dataset's
+:ref:`Runs page ` where you can monitor
+the status of the task.
+
+.. _data-quality-analyze:
+
+Analyzing scan results
+______________________
+
+Once an issue scan is complete, its card will update to display an interactive
+histogram that you can use to analyze the findings:
+
+.. image:: /images/teams/data_quality_brightness_analysis.png
+ :alt: data-quality-brightness-analysis
+ :align: center
+
+.. note::
+
+ When analyzing issue scan results, we recommend using the split screen icon
+ to the right of the Samples panel tab to arrange the Samples panel and Data
+ Quality panel side-by-side, as shown above.
+
+Each issue type's results are stored under a dedicated field of the dataset,
+from which the displayed histograms are generated:
+
+- **Brightness**: the brightness of each image is stored in a `brightness`
+ field of the sample
+- **Blurriness**: the blurriness of each image is stored in a `blurriness`
+ field of the sample
+- **Aspect Ratio**: the aspect ratio of each image is stored in an
+ `aspect_ratio` field of the sample
+- **Entropy**: the entropy of each image is stored in an `entropy` field of
+ the sample
+- **Near Duplicates**: the nearest neighbor distance of each sample is stored
+ in a `nearest_neighbor` field of the sample
+- **Exact Duplicates**: the filehash of each image is stored in a `filehash`
+ field of the sample
+
+Each issue type comes with a default threshold range that highlights potential
+issues in your dataset. If issues are identified, the number of potential
+issues will be displayed in the top-left corner of the Data Quality panel and
+the Samples panel will automatically update to show the corresponding samples
+in the grid.
+
+You can also use the threshold slider to manually explore different threshold
+ranges. When you release the slider, the Samples panel will automatically
+update to show the corresponding samples:
+
+.. image:: /images/teams/data_quality_brightness_slider.gif
+ :alt: data-quality-brightness-slider
+ :align: center
+
+If you find a better threshold for a dataset, you can save it via the
+"Save Threshold" option under the settings menu. You can use
+"Reset Threshold" to revert to the default threshold at any time.
+
+Once you've reviewed the potential issues in the grid, you can use the
+"Add Tags" button to take action on them. Clicking the button will display a
+modal like this:
+
+.. image:: /images/teams/data_quality_brightness_tag.png
+ :alt: data-quality-brightness-tag
+ :align: center
+
+.. note::
+
+ If you've selected samples in the grid, only those samples will be tagged.
+ Otherwise, tags will be added to all samples in your current view (i.e.,
+ all potential issues).
+
+You can use the "sample tags" filter in the
+:ref:`App's sidebar ` to retrieve, review, and act on all
+samples that you've previously tagged. 
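If you prefer to work with your tagged samples programmatically, a minimal SDK
sketch (the tag name below is just an example) looks like:

.. code-block:: python
    :linenos:

    import fiftyone as fo

    dataset = fo.load_dataset(...)

    # Retrieve all samples that were tagged during review
    tagged_view = dataset.match_tags("brightness_issue")
    print(tagged_view.count())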
+ +The review status indicator in the top-right corner of the panel indicates +whether an issue type is currently "In Review" or "Reviewed". You can click on +it at any time to toggle the review status. + +If you navigate away from an issue type that is currently "In Review", you'll +be prompted to indicate whether or not you'd like to mark the issue type as +"Reviewed": + +.. image:: /images/teams/data_quality_brightness_mark_as_reviewed.png + :alt: data-quality-brightness-mark-as-reviewed + :align: center + +.. _data-quality-update: + +Updating a scan +_______________ + +The Data Quality panel gracefully adapts to changes in your datasets after +scans have been performed. + +If you delete samples from a dataset, the +:ref:`histograms ` of any existing scans will +automatically be updated to reflect the new distribution. + +If you add new samples to a dataset or clear some existing field values +associated with a scan (e.g., `brightness` field values for brightness scans), +the panel will automatically detect the presence of unscanned samples and will +display contextual information from the :ref:`home page ` +and :ref:`analysis page `: + +.. image:: /images/teams/data_quality_new_samples_home.png + :alt: data-quality-new-samples-home + :align: center + +To update an existing scan, open the issue type and click the +"Scan New Samples" button in the bottom-right corner of the +:ref:`analysis page `. This will open a modal that +provides additional context and prompts you to initiate the new samples scan: + +.. image:: /images/teams/data_quality_new_samples_modal.png + :alt: data-quality-new-samples-modal + :align: center + +.. _data-quality-delete: + +Deleting a scan +_______________ + +You can delete an issue scan by simply deleting the corresponding field from +the dataset (e.g., `brightness` for brightness scans). + +.. note:: + + Did you know? You can delete sample fields from the App using the + `delete_sample_field` operator available via the + :ref:`Operator browser `. diff --git a/docs/source/teams/index.rst b/docs/source/teams/index.rst index b41846648f..0a69fc6215 100644 --- a/docs/source/teams/index.rst +++ b/docs/source/teams/index.rst @@ -93,11 +93,29 @@ pages on this site apply to Teams deployments as well. :button_link: teams_app.html .. customcalloutitem:: - :header: Data Lens + :header: Data Lens __SUB_NEW__ :description: Use FiftyOne Teams to explore and import samples from external data sources. :button_text: Connect your data lake :button_link: data_lens.html +.. customcalloutitem:: + :header: Data Quality __SUB_NEW__ + :description: Automatically scan your data for quality issues and take action to resolve them. + :button_text: Find quality issues + :button_link: data_quality.html + +.. customcalloutitem:: + :header: Model Evaluation __SUB_NEW__ + :description: Evaluate your models and interactively and visually analyze their performance. + :button_text: Evaluate models + :button_link: ../user_guide/app.html#app-model-evaluation-panel + +.. customcalloutitem:: + :header: Query Performance __SUB_NEW__ + :description: Configure your massive datasets to support fast queries at scale. + :button_text: Fast queries at scale + :button_link: query_performance.html + .. customcalloutitem:: :header: Plugins :description: Learn how to install and manage shared plugins for your Teams deployment. @@ -149,8 +167,9 @@ pages on this site apply to Teams deployments as well. 
Dataset Versioning FiftyOne Teams App Data Lens __SUB_NEW__ + Data Quality __SUB_NEW__ + Query Performance __SUB_NEW__ Plugins - Query Performance Secrets Management SDK Migrations diff --git a/docs/source/teams/overview.rst b/docs/source/teams/overview.rst index 5acba726dd..25d141f7fe 100644 --- a/docs/source/teams/overview.rst +++ b/docs/source/teams/overview.rst @@ -108,7 +108,7 @@ would with open source FiftyOne: - | :ref:`Using the FiftyOne App ` | :ref:`Creating views into datasets ` | `Embedding-based dataset analysis `_ - | :ref:`Visual similarity and dataset uniqueness ` + | :ref:`Visual similarity ` and :ref:`dataset uniqueness ` * - Annotation - :ref:`Using the annotation API ` * - Model training and evaluation diff --git a/docs/source/teams/query_performance.rst b/docs/source/teams/query_performance.rst index c0f331f79a..462ff71f7b 100644 --- a/docs/source/teams/query_performance.rst +++ b/docs/source/teams/query_performance.rst @@ -1,147 +1,200 @@ .. _query-performance: -Query Performance (NEW) -======================= +Query Performance +================= -Query Performance is a feature built into the :ref:`FiftyOne Teams App ` -which allows users to use FiftyOne to improve the performance of the sidebar and background -queries through the use of indexes and summary fields. +.. default-role:: code -.. _query-performance-how-it-works: +**Available in FiftyOne Teams v2.2+** -Turning on Query Performance -____________________________ - -.. image:: /images/teams/qp_home.png - :alt: query-performance-home-tab - :align: center - -Query Performance is enabled by default in the FiftyOne Teams App. You can toggle -Query Performance by clicking on the "Query Performance" switch in the menu bar. - -.. image:: /images/teams/qp_toggle.png - :alt: query-performance-toggle - :align: center +Query Performance is a builtin feature of the +:ref:`FiftyOne Teams App ` that leverages database indexes to +optimize your queries on large-scale datasets. -Within the Query Performance panel, you can see the status of Query Performance mode and turn -the mode on/off by clicking on the gear icon. +Optimizing Query Performance +____________________________ -.. image:: /images/teams/qp_config.png - :alt: query-performance-config - :align: center +The App's sidebar is optimized to leverage database indexes whenever possible. -There is also a helpful tooltip when the user hovers over the gold lightning bolt icon -in the side bar. The tooltip will show a button to open Query Performance panel; if user -clicks on the `Got it` button the tooltip will be permanently dismissed. +Fields that are indexed are indicated by lightning bolt icons next to their +field/attribute names: -.. image:: /images/teams/qp_tooltip.png - :alt: query-performance-tooltip +.. image:: /images/app/app-query-performance.gif + :alt: app-query-performance :align: center -Admin Configuration: +The above GIF shows Query Performance in action on the train split of the +:ref:`BDD100K dataset ` with an index on the +`detections.detections.label` field. -- ``FIFTYONE_APP_DEFAULT_QUERY_PERFORMANCE``: Set to ``false`` to change the default setting for all users -- ``FIFTYONE_APP_ENABLE_QUERY_PERFORMANCE``: Set to ``false`` to completely disable the feature for all users +.. note:: -Query Performance Toast ------------------------ + When filtering by multiple fields, queries will be more efficient when your + **first** filter is on an indexed field. 
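For reference, an index like the one used above can also be created via the
SDK; a minimal sketch, assuming `dataset` is already loaded, is:

.. code-block:: python
    :linenos:

    import fiftyone as fo

    dataset = fo.load_dataset(...)

    # Index the label field that the sidebar filter above relies on
    dataset.create_index("detections.detections.label")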
-When you open the FiftyOne Teams App with Query Performance enabled, you will see a toast
-notification whenever a sidebar query is run that could benefit from Query Performance. For
-example you can click on a label filter on the sidebar, and if the filter takes longer than
-a few seconds to load, the toast will be opened.
+If you perform a filter that could benefit from an index and the query takes
+longer than a few seconds, you'll see a toast notification that nudges you to
+take the appropriate action to optimize the query:
 .. image:: /images/teams/qp_toast.png
 :alt: query-performance-toast
 :align: center
-The toast notification will show you two options: "Create Index" and "Dismiss".
-Clicking "Create Index" will open the Query Performance panel where you can create an index.
+Clicking "Create Index" will open the
+:ref:`Query Performance panel ` with a preconfigured
+recommendation of an :ref:`index ` or
+:ref:`summary field ` to create.
+
+.. note::
-Clicking "Dismiss" will close the toast notification for all datasets for the current session.
-Users can also close the toast notification by clicking outside the toast notification. The
-toast notification will also close automatically after a few seconds.
+ Clicking "Dismiss" will prevent this notification from appearing for the
+ remainder of your current App session.
-Query Performance Panel
+.. _query-performance-panel:
+
+Query Performance panel
 _______________________
-The Query Performance panel can be accessed through:
+You can open the Query Performance panel manually either by clicking the "+"
+icon next to the Samples tab or by clicking the yellow lightning bolt in the
+top-right of the sidebar:
+
+.. image:: /images/teams/qp_tooltip.png
+ :alt: query-performance-tooltip
+ :align: center
-- The panel menu
-- The "Create Index" button
+The first time you open the Query Performance panel, you'll see a welcome page:
-Each dataset includes default indexes created at initialization. The panel displays a table showing:
+.. image:: /images/teams/qp_home.png
+ :alt: query-performance-home-tab
+ :align: center
-- All indexes (default and custom)
-- Index sizes
-- Available actions:
- ``Drop Index``
- ``Drop Summary Field``
- ``Refresh Summary Field`` (only for summary fields)
+After you've created at least one custom index or summary field for a dataset,
+you'll instead see a list of the indexes and summary fields that exist on the
+dataset:
-Custom indexes can be created via the panel, SDK client, or MongoDB client.
 .. image:: /images/teams/qp_tableview.png
 :alt: query-performance-tableview
 :align: center
-Create Index
-------------
+.. _query-performance-index:
+
+Creating indexes
+----------------
-The Query Performance panel shows the query that could benefit from an index. You can create an
-index by clicking the `Create Index` button. The index will be created in the background and you
-will see the progress of the index creation in the Query Performance panel. You can create multiple
-indexes at the same time. For each index, users can also have the option to add Unique constraint.
+You can create a new index at any time by clicking the `Create Index` button
+in the top-right of the panel:
 .. image:: /images/teams/qp_create_index.png
 :alt: query-performance-create-index
 :align: center
+When you click "Execute", the index will be initiated and you'll see
+"In progress" in the panel's summary table. 
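While the index builds (and after it completes), you can also inspect the
dataset's indexes from the SDK; a quick sketch:

.. code-block:: python
    :linenos:

    import fiftyone as fo

    dataset = fo.load_dataset(...)

    # List the indexes that exist on the dataset
    print(dataset.list_indexes())
    print(dataset.get_index_information())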
+
+After the index creation has finished, the field that you indexed will have a
+lightning bolt icon in the sidebar, and you should notice that expanding the
+field's filter widget and performing queries on it will be noticeably faster.
+
 .. warning::
-For large, complex datasets, index creation can have an impact on the performance of the database.
-It is recommended to consult and communicate with your database administrator and teammates
-before attempting such an operation.
-After the indexes are created, the fields with index will be highlighted in the sidebar with a lightning
-bolt icon. Expanding the side bar filter for indexed fields will be noticeably faster.
+ For large datasets, index creation can have a significant impact on the
+ performance of the database while the index is under construction.
+
+ We recommend indexing *only* the specific fields that you wish to perform
+ initial filters on, and we recommend consulting with your deployment admin
+ before creating multiple indexes simultaneously.
-Create Summary Field
--------------------
+You can also create and manage custom indexes
+:ref:`via the SDK `.
-The Query Performance panel also allows users to create a summary field. Summary fields are sample-level fields that
-are computed and stored in the database. For example, users can create a summary field for objects detected in every
-frame. This allows users to filter quickly across the dataset to find samples with the desired objects.
+.. _query-performance-summary:
+
+Creating summary fields
+-----------------------
+
+The Query Performance panel also allows you to create
+:ref:`summary fields `, which are sample-level fields that
+allow you to efficiently perform queries on large datasets where directly
+querying the underlying field is prohibitively slow due to the number of
+objects/frames in the field.
+
+For example, summary fields can help you query video datasets to find samples
+that contain specific classes of interest, e.g., `person`, in at least one frame.
+
+You can create a new summary field at any time by clicking the `Create Index`
+button in the top-right of the panel and selecting the "Summary field" type in
+the modal:
 .. image:: /images/teams/qp_create_summary_field.png
 :alt: query-performance-create-summary-field
 :align: center
-The summary field is also enhanced with relevant indexes to improve its performance. Users can choose to remove the
-summary field by clicking the `Drop Index/Field` action in the table. Users can also choose to remove the individual
-indexes associated with the summary field.
+.. warning::
+
+ For large datasets, creating summary fields can take a few minutes.
+
+You can also create and manage summary fields
+:ref:`via the SDK `.
+
+.. _query-performance-update:
+
+Updating summary fields
+-----------------------
+
+Since a :ref:`summary field ` is derived from the contents of
+another field, it must be updated whenever there have been modifications to its
+source field.
+
+Click the update icon in the actions column of any summary field to open a
+modal that will provide guidance on whether to update the summary field to
+reflect recent dataset changes.
+
+.. _query-performance-delete:
-Performance Considerations
+Deleting indexes/summaries
 --------------------------
-.. warning::
+You can delete a custom index or summary field by clicking its trash can icon
+in the actions column of the panel.
+
+.. 
_query-performance-disable: + +Disabling Query Performance +___________________________ + +Query Performance is enabled by default for all datasets. This is generally the +recommended setting for all large datasets to ensure that queries are +performant. + +However, in certain circumstances you may prefer to disable Query Performance, +which enables the App's sidebar to show additional information such as +label/value counts that are useful but more expensive to compute. + +You can enable/disable Query Performance for a particular dataset for its +lifetime (in your current browser) via the gear icon in the Samples panel's +actions row: + +.. image:: /images/app/app-query-performance-disabled.gif + :alt: app-query-performance-disabled + :align: center + +You can also enable/disable Query Performance via the status button in the +upper right corner of the Query Performance panel: + +.. image:: /images/teams/qp_config.png + :alt: query-performance-config + :align: center - For large datasets, the following operations may take significant time to complete: - - - Creating summary fields - - Updating summary fields - - Deleting summary fields - - Additionally: - - - Deleting an index or summary field will remove its performance benefits - - These operations cannot be cancelled once started - - Plan these operations during low-usage periods +Deployment admins can also configure the global behavior of Query Performance +via the following environment variables: -Update Summary Field --------------------- +.. code-block:: shell -Summary fields can be updated via the ``Refresh Summary Field`` action to reflect recent dataset changes. + # Disable Query Performance by default for all new datasets + FIFTYONE_APP_DEFAULT_QUERY_PERFORMANCE=false -Delete Index and Field ----------------------- +.. code-block:: shell -Use ``Drop Index`` or ``Drop Summary Field`` actions to remove indexes or summary fields from the dataset. + # Completely disable Query Performance for all users + FIFTYONE_APP_ENABLE_QUERY_PERFORMANCE=false diff --git a/docs/source/user_guide/app.rst b/docs/source/user_guide/app.rst index a323876c8c..f1c89dc842 100644 --- a/docs/source/user_guide/app.rst +++ b/docs/source/user_guide/app.rst @@ -400,32 +400,16 @@ only those samples and/or labels that match the filter. Optimizing query performance ---------------------------- -By default, the sidebar filters are optimized to use indexes when no view is -present. Filters that do have an index are highlighted with the lightning bolt -icon. Query performance can be disabled by default via -`default_query_performance` in your -:ref:`App config `. - -When a view is present and indexes are no longer applicable, granular filter -widgets are shown that include comphrensive counts. Granular filters can be -toggled for a dataset via the settings "Gear" or -`enable_query_performance` and `disabled_query_performance` operators. - -.. image:: /images/app/app-granular.gif - :alt: app-granular - :align: center - -.. note:: +The App's sidebar is optimized to leverage database indexes whenever possible. - When query performance mode is toggled through the "Gear" icon or - `enable_query_performance` and `disabled_query_performance` operators - the setting is persisted in your browser for that dataset +Fields that are indexed are indicated by lightning bolt icons next to their +field/attribute names: -.. image:: /images/app/app-query-performance-mode.gif - :alt: app-query-performance-mode +.. 
image:: /images/app/app-query-performance.gif + :alt: app-query-performance :align: center -The above GIF shows query performance mode in action on the train split of the +The above GIF shows query performance in action on the train split of the :ref:`BDD100K dataset ` with an index on the `detections.detections.label` field: @@ -448,6 +432,11 @@ The above GIF shows query performance mode in action on the train split of the session = fo.launch_app(dataset) +.. note:: + + When filtering by multiple fields, queries will be more efficient when your + **first** filter is on an indexed field. + The SDK provides a number of useful utilities for managing indexes on your datasets: @@ -462,8 +451,9 @@ datasets: .. note:: - Did you know? Teams customers can manage dataset indexes via the App with the builtin Query Performance panel! - See :ref:`this page ` for more information. + Did you know? With :ref:`FiftyOne Teams ` you can manage + indexes natively in the App via the + :ref:`Query Performance panel `. In general, we recommend indexing *only* the specific fields that you wish to perform initial filters on: @@ -492,14 +482,12 @@ perform initial filters on: .. note:: - Frame fields are not directly optimizable. Use - :ref:`summary fields ` to efficiently query frame-level - information on large video datasets + Filtering by frame fields of video datasets is not directly optimizable by + creating indexes. Instead, use :ref:`summary fields ` to + efficiently query frame-level information on large video datasets. -.. note:: - - Frame filtering for the grid can be completely disabled via the - `disable_frame_filtering` setting in + Frame filtering in the App's grid view can be disabled by setting + `disable_frame_filtering=True` in your :ref:`App config `. For :ref:`grouped datasets `, you should create two indexes for each @@ -554,6 +542,30 @@ field: Numeric field filters are not supported by wildcard indexes. +.. _app-disasbling-query-performance: + +Disabling query performance +--------------------------- + +Query performance is enabled by default for all datasets. This is generally the +recommended setting for all large datasets to ensure that queries are +performant. + +However, in certain circumstances you may prefer to disable query performance, +which enables the App's sidebar to show additional information such as +label/value counts that are useful but more expensive to compute. + +You can disable query performance for a particular dataset for its lifetime +(in your current browser) via the gear icon in the Samples panel's actions row: + +.. image:: /images/app/app-query-performance-disabled.gif + :alt: app-query-performance-disabled + :align: center + +You can also disable query performance by default for all datasets by setting +`default_query_performance=False` in your +:ref:`App config `. + .. _app-sidebar-groups: Sidebar groups @@ -1302,12 +1314,14 @@ FiftyOne natively includes the following Panels: - :ref:`Samples panel `: the media grid that loads by default when you launch the App -- :ref:`Histograms panel `: a dashboard of histograms - for the fields of your dataset - :ref:`Embeddings panel `: a canvas for working with :ref:`embeddings visualizations ` +- :ref:`Model Evaluation panel `: interactively + analyze and visualize your model's performance - :ref:`Map panel `: visualizes the geolocation data of datasets that have a |GeoLocation| field +- :ref:`Histograms panel `: a dashboard of histograms + for the fields of your dataset .. 
note:: @@ -1624,8 +1638,8 @@ _____________ By default, when you launch the App, your spaces layout will contain a single space with the Samples panel active: -.. image:: /images/app/app-histograms-panel.gif - :alt: app-histograms-panel +.. image:: /images/app/app-samples-panel.gif + :alt: app-samples-panel :align: center When configuring spaces :ref:`in Python `, you can create a @@ -1636,49 +1650,6 @@ Samples panel as follows: samples_panel = fo.Panel(type="Samples") -.. _app-histograms-panel: - -Histograms panel -________________ - -The Histograms panel in the App lets you visualize different statistics about -the fields of your dataset. - -- The `Sample tags` and `Label tags` modes show the distribution of any - :ref:`tags ` that you've added to your dataset -- The `Labels` mode shows the class distributions for each - :ref:`labels field ` that you've added to your dataset. For - example, you may have histograms of ground truth labels and one more sets - of model predictions -- The `Other fields` mode shows distributions for numeric (integer or float) - or categorical (e.g., string) - :ref:`primitive fields ` that you've added to your - dataset. For example, if you computed - :ref:`uniqueness ` on your dataset, a histogram of - uniqueness values will be available under this mode. - -.. note:: - - The statistics in the plots automatically update to reflect the current - :ref:`view ` that you have loaded in the App! - -.. image:: /images/app/app-histograms-panel.gif - :alt: app-histograms-panel - :align: center - -When configuring spaces :ref:`in Python `, you can define a -Histograms panel as follows: - -.. code-block:: python - :linenos: - - histograms_panel = fo.Panel(type="Histograms", state=dict(plot="Labels")) - -The Histograms panel supports the following `state` parameters: - -- **plot**: the histograms to plot. Supported values are `"Sample tags"`, - `"Label tags"`, `"Labels"`, and `"Other fields"` - .. _app-embeddings-panel: Embeddings panel @@ -1723,6 +1694,12 @@ samples/patches in the Samples panel: :alt: app-embeddings-panel :align: center +.. note:: + + Did you know? With :ref:`FiftyOne Teams ` you can generate + embeddings visualizations natively from the App + :ref:`in the background ` while you work. + The embeddings UI also provides a number of additional controls: - Press the `pan` icon in the menu (or type `g`) to switch to pan mode, in @@ -1763,6 +1740,139 @@ The Embeddings panel supports the following `state` parameters: - **colorByField**: an optional sample field (or label attribute, for patches embeddings) to color the points by +.. _app-model-evaluation-panel: + +Model Evaluation panel __SUB_NEW__ +__________________________________ + +When you load a dataset in the App that contains one or more +:ref:`evaluations `, you can open the Model Evaluation panel +to visualize and interactively explore the evaluation results in the App: + +.. code-block:: python + :linenos: + + import fiftyone as fo + import fiftyone.zoo as foz + + dataset = foz.load_zoo_dataset("quickstart") + + # Evaluate the objects in the `predictions` field with respect to the + # objects in the `ground_truth` field + results = dataset.evaluate_detections( + "predictions", + gt_field="ground_truth", + eval_key="eval", + ) + + session = fo.launch_app(dataset) + +The panel's home page shows a list of evaluation on the dataset, their current +review status, and any evaluation notes that you've added. 
Click on an +evaluation to open its expanded view, which provides a set of expandable cards +that dives into various aspects of the model's performance: + +.. image:: /images/app/model-evaluation-open.gif + :alt: model-evaluation-open + :align: center + +.. note:: + + Did you know? With :ref:`FiftyOne Teams ` you can execute + model evaluations natively from the App + :ref:`in the background ` while you work. + +Review status +------------- + +You can use the status pill in the upper right-hand corner of the panel to +toggle an evaluation between `Needs Review`, `In Review`, and `Reviewed`: + +.. image:: /images/app/model-evaluation-review.gif + :alt: model-evaluation-review + :align: center + +Evaluation notes +---------------- + +The Evaluation Notes card provides a place to add your own Markdown-formatted +notes about the model's performance: + +.. image:: /images/app/model-evaluation-notes.gif + :alt: model-evaluation-notes + :align: center + +Summary +------- + +The Summary card provides a table of common model performance metrics. You can +click on the grid icons next to TP/FP/FN to load the corresponding labels in +the Samples panel: + +.. image:: /images/app/model-evaluation-summary.gif + :alt: model-evaluation-summary + :align: center + +Metric performance +------------------ + +The Metric Performance card provides a graphical summary of key model +performance metrics: + +.. image:: /images/app/model-evaluation-metric.gif + :alt: model-evaluation-metric + :align: center + +Class performance +----------------- + +The Class Performance card provides a per-class breakdown of each model +performance metric. If an evaluation contains many classes, you can use the +settings menu to control which classes are shown. The histograms are also +interactive: you can click on bars to show the corresponding labels in the +Samples panel: + +.. image:: /images/app/model-evaluation-class.gif + :alt: model-evaluation-class + :align: center + +Confusion matrices +------------------ + +The Confusion Matrices card provides an interactive confusion matrix for the +evaluation. If an evaluation contains many classes, you can use the settings +menu to control which classes are shown. You can also click on cells to show +the corresponding labels in the Samples panel: + +.. image:: /images/app/model-evaluation-confusion.gif + :alt: model-evaluation-confusion + :align: center + +Comparing models +---------------- + +When a dataset contains multiple evaluations, you can compare two model's +performance by selecting a "Compare against" key: + +.. code-block:: python + :linenos: + + model = foz.load_zoo_model("yolo11s-coco-torch") + + dataset.apply_model(model, label_field="predictions_yolo11") + + dataset.evaluate_detections( + "predictions_yolo11", + gt_field="ground_truth", + eval_key="eval_yolo11", + ) + + session.refresh() + +.. image:: /images/app/model-evaluation-compare.gif + :alt: model-evaluation-compare + :align: center + .. _app-map-panel: Map panel @@ -1892,6 +2002,49 @@ the above values on a :ref:`dataset's App config `: Dataset-specific plugin settings will override any settings from your :ref:`global App config `. +.. _app-histograms-panel: + +Histograms panel +________________ + +The Histograms panel in the App lets you visualize different statistics about +the fields of your dataset. 
+ +- The `Sample tags` and `Label tags` modes show the distribution of any + :ref:`tags ` that you've added to your dataset +- The `Labels` mode shows the class distributions for each + :ref:`labels field ` that you've added to your dataset. For + example, you may have histograms of ground truth labels and one more sets + of model predictions +- The `Other fields` mode shows distributions for numeric (integer or float) + or categorical (e.g., string) + :ref:`primitive fields ` that you've added to your + dataset. For example, if you computed + :ref:`uniqueness ` on your dataset, a histogram of + uniqueness values will be available under this mode. + +.. note:: + + The statistics in the plots automatically update to reflect the current + :ref:`view ` that you have loaded in the App! + +.. image:: /images/app/app-histograms-panel.gif + :alt: app-histograms-panel + :align: center + +When configuring spaces :ref:`in Python `, you can define a +Histograms panel as follows: + +.. code-block:: python + :linenos: + + histograms_panel = fo.Panel(type="Histograms", state=dict(plot="Labels")) + +The Histograms panel supports the following `state` parameters: + +- **plot**: the histograms to plot. Supported values are `"Sample tags"`, + `"Label tags"`, `"Labels"`, and `"Other fields"` + .. _app-select-samples: Selecting samples diff --git a/docs/source/user_guide/evaluation.rst b/docs/source/user_guide/evaluation.rst index 2f53e89a47..57e25d5b11 100644 --- a/docs/source/user_guide/evaluation.rst +++ b/docs/source/user_guide/evaluation.rst @@ -9,20 +9,12 @@ FiftyOne provides a variety of builtin methods for evaluating your model predictions, including regressions, classifications, detections, polygons, instance and semantic segmentations, on both image and video datasets. -.. note:: - - Did you know? You can evaluate models from within the FiftyOne App by - installing the - `@voxel51/evaluation `_ - plugin! - When you evaluate a model in FiftyOne, you get access to the standard aggregate metrics such as classification reports, confusion matrices, and PR curves for your model. In addition, FiftyOne can also record fine-grained statistics like accuracy and false positive counts at the sample-level, which you can -leverage via :ref:`dataset views ` and the -:ref:`FiftyOne App ` to interactively explore the strengths and -weaknesses of your models on individual data samples. +:ref:`interactively explore ` in the App to diagnose +the strengths and weaknesses of your models on individual data samples. Sample-level analysis often leads to critical insights that will help you improve your datasets and models. For example, viewing the samples with the @@ -53,22 +45,38 @@ method: .. code-block:: python :linenos: + import fiftyone as fo import fiftyone.zoo as foz dataset = foz.load_zoo_dataset("quickstart") - print(dataset) # Evaluate the objects in the `predictions` field with respect to the # objects in the `ground_truth` field results = dataset.evaluate_detections( "predictions", gt_field="ground_truth", - eval_key="eval_predictions", + eval_key="eval", ) -Aggregate metrics + session = fo.launch_app(dataset) + +Model Evaluation panel __SUB_NEW__ +---------------------------------- + +When you load a dataset in the App that contains one or more +:ref:`evaluations `, you can open the +:ref:`Model Evaluation panel ` to visualize and +interactively explore the evaluation results in the App: + +.. 
image:: /images/app/model-evaluation-compare.gif + :alt: model-evaluation-compare + :align: center + +Per-class metrics ----------------- +You can also retrieve and interact with evaluation results via the SDK. + Running an evaluation returns an instance of a task-specific subclass of |EvaluationResults| that provides a handful of methods for generating aggregate statistics about your dataset. @@ -102,14 +110,13 @@ statistics about your dataset. macro avg 0.27 0.57 0.35 1311 weighted avg 0.42 0.68 0.51 1311 - .. note:: + For details on micro, macro, and weighted averaging, see the `sklearn.metrics documentation `_. - -Sample metrics --------------- +Per-sample metrics +------------------ In addition to standard aggregate metrics, when you pass an ``eval_key`` parameter to the evaluation routine, FiftyOne will populate helpful @@ -131,8 +138,8 @@ dataset: # only includes false positive boxes in the `predictions` field view = ( dataset - .sort_by("eval_predictions_fp", reverse=True) - .filter_labels("predictions", F("eval_predictions") == "fp") + .sort_by("eval_fp", reverse=True) + .filter_labels("predictions", F("eval") == "fp") ) # Visualize results in the App @@ -160,34 +167,17 @@ real performance of a model. Confusion matrices ------------------ -When you use evaluation methods such as -:meth:`evaluate_classifications() ` -and -:meth:`evaluate_detections() ` -to evaluate model predictions, the confusion matrices that you can generate -by calling the -:meth:`plot_confusion_matrix() ` -method are responsive plots that can be attached to App instances to -interactively explore specific cases of your model's performance. - .. note:: - See :ref:`this section ` for more information about - interactive confusion matrices in FiftyOne. + The easiest way to work with confusion matrices in FiftyOne is via the + :ref:`Model Evaluation panel `! -Continuing with our example, the code block below generates a confusion matrix -for our evaluation results and :ref:`attaches it to the App `. - -In this setup, you can click on individual cells of the confusion matrix to -select the corresponding ground truth and/or predicted objects in the App. For -example, if you click on a diagonal cell of the confusion matrix, you will -see the true positive examples of that class in the App. - -Likewise, whenever you modify the Session's view, either in the App or by -programmatically setting -:meth:`session.view `, the confusion matrix -is automatically updated to show the cell counts for only those objects that -are included in the current view. +When you use evaluation methods such as +:meth:`evaluate_detections() ` +that support confusion matrices, you can use the +:meth:`plot_confusion_matrix() ` +method to render responsive plots that can be attached to App instances to +interactively explore specific cases of your model's performance: .. code-block:: python :linenos: @@ -203,6 +193,17 @@ are included in the current view. :alt: detection-evaluation :align: center +In this setup, you can click on individual cells of the confusion matrix to +select the corresponding ground truth and/or predicted objects in the App. For +example, if you click on a diagonal cell of the confusion matrix, you will +see the true positive examples of that class in the App. + +Likewise, whenever you modify the Session's view, either in the App or by +programmatically setting +:meth:`session.view `, the confusion matrix +is automatically updated to show the cell counts for only those objects that +are included in the current view. + .. 
_managing-evaluations: Managing evaluations @@ -228,22 +229,22 @@ The example below demonstrates the basic interface: # List evaluations you've run on a dataset dataset.list_evaluations() - # ['eval_predictions'] + # ['eval'] # Print information about an evaluation - print(dataset.get_evaluation_info("eval_predictions")) + print(dataset.get_evaluation_info("eval")) # Load existing evaluation results and use them - results = dataset.load_evaluation_results("eval_predictions") + results = dataset.load_evaluation_results("eval") results.print_report() # Rename the evaluation # This will automatically rename any evaluation fields on your dataset - dataset.rename_evaluation("eval_predictions", "eval") + dataset.rename_evaluation("eval", "still_eval") # Delete the evaluation # This will remove any evaluation data that was populated on your dataset - dataset.delete_evaluation("eval") + dataset.delete_evaluation("still_eval") The sections below discuss evaluating various types of predictions in more detail. @@ -490,10 +491,8 @@ to it to demonstrate the workflow: .. note:: - Did you know? You can - :ref:`attach confusion matrices to the App ` and - interactively explore them by clicking on their cells and/or modifying your - view in the App. + The easiest way to analyze models in FiftyOne is via the + :ref:`Model Evaluation panel `! Top-k evaluation ---------------- @@ -576,6 +575,11 @@ from a pre-trained model from the :ref:`Model Zoo `: :alt: imagenet-top-k-eval :align: center +.. note:: + + The easiest way to analyze models in FiftyOne is via the + :ref:`Model Evaluation panel `! + Binary evaluation ----------------- @@ -671,6 +675,11 @@ added to it to demonstrate the workflow: :alt: cifar10-binary-pr-curve :align: center +.. note:: + + The easiest way to analyze models in FiftyOne is via the + :ref:`Model Evaluation panel `! + .. _evaluating-detections: Detections @@ -1027,6 +1036,11 @@ The example below demonstrates COCO-style detection evaluation on the :alt: quickstart-evaluate-detections :align: center +.. note:: + + The easiest way to analyze models in FiftyOne is via the + :ref:`Model Evaluation panel `! + mAP and PR curves ~~~~~~~~~~~~~~~~~ @@ -1098,12 +1112,6 @@ ground truth objects of different classes. :alt: coco-confusion-matrix :align: center -.. note:: - - Did you know? :ref:`Confusion matrices ` can be - attached to your |Session| object and dynamically explored using FiftyOne's - :ref:`interactive plotting features `! - .. _evaluating-detections-open-images: Open Images-style evaluation @@ -1257,6 +1265,11 @@ The example below demonstrates Open Images-style detection evaluation on the :alt: quickstart-evaluate-detections-oi :align: center +.. note:: + + The easiest way to analyze models in FiftyOne is via the + :ref:`Model Evaluation panel `! + mAP and PR curves ~~~~~~~~~~~~~~~~~ @@ -1331,12 +1344,6 @@ matched with ground truth objects of different classes. :alt: oi-confusion-matrix :align: center -.. note:: - - Did you know? :ref:`Confusion matrices ` can be - attached to your |Session| object and dynamically explored using FiftyOne's - :ref:`interactive plotting features `! - .. _evaluating-detections-activitynet: ActivityNet-style evaluation (default temporal) @@ -1479,6 +1486,11 @@ on the :ref:`ActivityNet 200 dataset `: :alt: activitynet-evaluate-detections :align: center +.. note:: + + The easiest way to analyze models in FiftyOne is via the + :ref:`Model Evaluation panel `! 
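+
+The same per-sample workflow shown earlier applies to temporal detections as
+well. Here's a minimal sketch, assuming the evaluation above was run with
+``eval_key="eval"`` and that your predicted segments live in a ``predictions``
+field (substitute your own names as needed):
+
+.. code-block:: python
+    :linenos:
+
+    import fiftyone as fo
+    from fiftyone import ViewField as F
+
+    # Samples with the most false positive segments, showing only the
+    # incorrectly predicted segments
+    view = (
+        dataset
+        .sort_by("eval_fp", reverse=True)
+        .filter_labels("predictions", F("eval") == "fp")
+    )
+
+    # Visualize the results in the App
+    session = fo.launch_app(view)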
+ mAP and PR curves ~~~~~~~~~~~~~~~~~ @@ -1598,12 +1610,6 @@ matched with ground truth segments of different classes. :alt: activitynet-confusion-matrix :align: center -.. note:: - - Did you know? :ref:`Confusion matrices ` can be - attached to your |Session| object and dynamically explored using FiftyOne's - :ref:`interactive plotting features `! - .. _evaluating-segmentations: Semantic segmentations @@ -1733,6 +1739,11 @@ masks generated by two DeepLabv3 models (with :alt: evaluate-segmentations :align: center +.. note:: + + The easiest way to analyze models in FiftyOne is via the + :ref:`Model Evaluation panel `! + .. _evaluation-advanced: Advanced usage diff --git a/docs/source/user_guide/index.rst b/docs/source/user_guide/index.rst index 6482dfe9e3..ca21e262c3 100644 --- a/docs/source/user_guide/index.rst +++ b/docs/source/user_guide/index.rst @@ -38,6 +38,12 @@ on your data quickly and easily. :button_text: Learn more about using dataset views :button_link: using_views.html +.. customcalloutitem:: + :header: Grouped datasets + :description: Use grouped datasets to represent your multiview image, video, and point cloud data. + :button_text: Learn more about grouped datasets + :button_link: groups.html + .. customcalloutitem:: :header: Using the App :description: Visualize your datasets in the FiftyOne App and interactively search, sort, and filter them. @@ -45,10 +51,16 @@ on your data quickly and easily. :button_link: app.html .. customcalloutitem:: - :header: Grouped datasets - :description: Use grouped datasets to represent your multiview image, video, and point cloud data. - :button_text: Learn more about grouped datasets - :button_link: groups.html + :header: Annotating datasets + :description: Use builtin or custom integrations to add or edit labels on your FiftyOne datasets. + :button_text: Learn more about annotations + :button_link: annotation.html + +.. customcalloutitem:: + :header: Evaluating models __SUB_NEW__ + :description: Use FiftyOne's builtin methods to evaluate your models and analyze their strengths and weaknesses. + :button_text: Learn more about evaluating models + :button_link: evaluation.html .. customcalloutitem:: :header: Using aggregations @@ -62,18 +74,6 @@ on your data quickly and easily. :button_text: Dive into interactive plotting :button_link: plots.html -.. customcalloutitem:: - :header: Annotating datasets - :description: Use builtin or custom integrations to add or edit labels on your FiftyOne datasets. - :button_text: Learn more about annotations - :button_link: annotation.html - -.. customcalloutitem:: - :header: Evaluating models - :description: Use FiftyOne's builtin methods to evaluate your models and analyze their strengths and weaknesses. - :button_text: Learn more about evaluating models - :button_link: evaluation.html - .. customcalloutitem:: :header: Exporting datasets :description: Export datasets to disk in any number of common formats, or in your own custom format. @@ -111,10 +111,10 @@ on your data quickly and easily. Dataset views Using the App Grouped datasets + Annotating datasets + Evaluating models __SUB_NEW__ Using aggregations Interactive plots - Annotating datasets - Evaluating models Exporting datasets Drawing labels on samples Configuring FiftyOne