
Commit

Merge pull request #720 from reportportal/develop
release
Vadim73i authored Apr 9, 2024
2 parents d89f279 + 3023361 commit f62d02e
Showing 11 changed files with 222 additions and 302 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -2,25 +2,25 @@

The documentation is built with [Docusaurus](https://docusaurus.io).

The search is implemented using [Algolia DocSearch](https://docsearch.algolia.com).

The OpenAPI documentation is generated using
[PaloAltoNetworks docusaurus-openapi-docs plugin](https://github.com/PaloAltoNetworks/docusaurus-openapi-docs).

## Running locally

1. Install the dependencies
```console
npm install
```

2. Run the application in development mode
```console
npm run start
```

3. For a production-ready build, use the following commands:
```console
npm run gen-all
npm run build
```
41 changes: 23 additions & 18 deletions docs/analysis/HowModelsAreRetrained.md
@@ -5,28 +5,33 @@ sidebar_label: How models are retrained

# How models are retrained

In the Auto-analysis and Machine Learning (ML) suggestions processes, several models contribute:

* Auto-analysis XGBoost model, which provides the probability of a test item being of a specific defect type, based on the most similar historical test item with that defect type.
* ML suggestions XGBoost model, which provides the probability that a test item is similar to a test item from the history.
* Error message language model on Tf-Idf vectors (Random Forest Classifier), which delivers a probability for the error message to be of a specific defect type or its subtype based on the words in the message. The probability from this model serves as a feature in the main boosting algorithm.
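As an illustration of the third model, here is a minimal sketch of a Tf-Idf + Random Forest classifier over error messages. The messages, labels, and training data are invented examples; the actual Analyzer implementation may differ.

```python
# Hypothetical sketch: an error-message classifier on Tf-Idf vectors.
# The messages and defect-type labels below are invented examples.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

messages = [
    "NullPointerException at LoginPage.submit",
    "AssertionError: expected 200 but was 500",
    "Environment variable DB_URL is not set",
    "Connection refused: selenium hub unreachable",
]
labels = ["pb", "pb", "si", "si"]  # pb = product bug, si = system issue

model = make_pipeline(TfidfVectorizer(), RandomForestClassifier(random_state=0))
model.fit(messages, labels)

# predict_proba yields one probability per defect type; as described above,
# this probability is then fed as a feature into the main boosting model.
proba = model.predict_proba(["AssertionError: expected true but was false"])[0]
print(dict(zip(model.classes_, proba.round(2))))
```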

In the beginning, you have global models at your disposal. These models, trained on six projects, were validated to give good accuracy on average. To build a more powerful and personalized test failure analysis, you should retrain the models on the data from your project.

:::note
If a global model performs better on your project data, the retrained model will not be saved, because we only keep custom models that outperform the global model on your data.
:::

Triggering information and retrained models are stored in Minio (or a filesystem), as set up in the Analyzer service settings.

Conditions for triggering retraining for **Error Message Random Forest Classifier**:
* Each time a test item's defect type is changed to another issue type (except "To Investigate"), we update the triggering info. This stores the total quantity of test items with defect types and the quantity labelled since the last training. This information is stored in the file "defect_type_trigger_info" in Minio.
* Retraining is triggered when there are more than 100 labelled items in total and 100 test items have been labelled since the last training. If the validation metrics are better than the global model's metrics on the same data points, a custom "defect_type" model is saved in Minio and then used in the auto-analysis and suggestions functionality.
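The bookkeeping above can be sketched as a simple threshold check. The field and function names here are assumptions for illustration, not the Analyzer's actual API.

```python
# Hypothetical sketch of the retraining trigger check for the
# "defect_type" model: retrain once both thresholds are reached.
from dataclasses import dataclass

@dataclass
class DefectTypeTriggerInfo:
    labelled_items_total: int      # all labelled test items on the project
    labelled_since_training: int   # labelled test items since the last retraining

def should_retrain_defect_type(info: DefectTypeTriggerInfo) -> bool:
    return info.labelled_items_total > 100 and info.labelled_since_training >= 100

print(should_retrain_defect_type(DefectTypeTriggerInfo(150, 100)))  # True
print(should_retrain_defect_type(DefectTypeTriggerInfo(150, 40)))   # False
```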


Conditions for triggering retraining of **Auto-analysis** and **Suggestion XGBoost models**:
* We collect training data from several sources:
* When a suggestion is selected (the chosen test item will be a positive example, others will be negative).
* When you don't select any suggestion and manually edit the test item (all suggestions become negative examples).
* When auto-analysis finds a similar test item, this is considered a positive example until the user manually changes the defect type, in which case the result is marked as negative.

When either a suggestion analysis runs or a defect type change occurs, we update the trigger info for both models. This information is stored in "auto_analysis_trigger_info" and "suggestion_trigger_info" files in Minio.

Retraining is triggered when:
* For the auto-analysis model: when there are more than 300 labelled items in total and 100 test items have been labelled since the last training. If the validation metrics are better than the global model's on the same data points, a custom "auto_analysis" model is saved in Minio and used in the auto-analysis functionality.
* For the suggestion model: when there are more than 100 labelled items in total and 50 test items have been labelled since the last training. If the validation metrics are better than the global model's on the same data points, a custom "suggestion" model is saved in Minio and used in the suggestions functionality.
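The two threshold pairs above, together with the rule that a custom model is kept only when it beats the global one, can be sketched as follows. Function names and the choice of comparison metric are assumptions for illustration.

```python
# Hypothetical sketch: per-model retraining thresholds and the rule that a
# custom model is saved only if it outperforms the global model on the same
# validation data points.
THRESHOLDS = {
    "auto_analysis": {"total": 300, "since_training": 100},
    "suggestion": {"total": 100, "since_training": 50},
}

def should_retrain(model_name: str, labelled_total: int, labelled_since: int) -> bool:
    t = THRESHOLDS[model_name]
    return labelled_total > t["total"] and labelled_since >= t["since_training"]

def should_save_custom_model(custom_metric: float, global_metric: float) -> bool:
    # e.g. a validation score such as F1, computed on the same data points
    return custom_metric > global_metric

print(should_retrain("auto_analysis", 350, 120))  # True
print(should_retrain("suggestion", 90, 60))       # False
```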
4 changes: 2 additions & 2 deletions docs/api/versioned_sidebars/api-sidebars.ts
@@ -47,10 +47,10 @@ const apiSidebars: SidebarsConfig = {
link: {
type: 'generated-index',
title: 'Service API',
description: 'This is a generated index of the ReportPortal Service API.',
slug: '/category/api/service-api-5.10'
},
items: require('../service-api/versions/5.10/sidebar.ts')
}
],
};
2 changes: 1 addition & 1 deletion docs/api/versioned_sidebars/uat-sidebars.ts
@@ -50,7 +50,7 @@ const uatSidebars: SidebarsConfig = {
description: 'This is a generated index of the ReportPortal Authorization API.',
slug: '/category/api/service-uat-5.10'
},
items: require('../service-uat/versions/5.10/sidebar.ts')
}
],
}