CI: update docs and reset tool for checksums #5372

Merged 5 commits on Oct 8, 2024
Changes from 3 commits
42 changes: 20 additions & 22 deletions Docs/source/developers/checksum.rst
@@ -1,32 +1,33 @@
.. _developers-checksum:

Checksum regression tests
=========================
Using checksums
===============

WarpX has checksum regression tests: as part of CI testing, when running a given test, the checksum module computes one aggregated number per field (``Ex_checksum = np.sum(np.abs(Ex))``) and compares it to a reference (benchmark). This should be sensitive enough to make the test fail if your PR causes a significant difference, print meaningful error messages, and give you a chance to fix a bug or reset the benchmark if needed.
When running an automated test, the checksum module computes one aggregated number per field (e.g., the sum of the absolute values of the array elements) and compares it to a reference value (benchmark).
This should be sensitive enough to make the test fail if your PR causes a significant difference, print meaningful error messages, and give you a chance to fix a bug or reset the benchmark if needed.
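
For illustration, here is a minimal sketch of the kind of per-field aggregation the checksum module performs (the field array and its values are made up for this example; the actual implementation lives in ``Regression/Checksum/``):

.. code-block:: python

   import numpy as np

   # hypothetical field array read from the test output (e.g., the Ex component)
   Ex = np.array([[1.0, -2.0], [3.0, -4.0]])

   # one aggregated number per field: the sum of the absolute values of its elements
   Ex_checksum = np.sum(np.abs(Ex))
   print(Ex_checksum)  # 10.0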

The checksum module is located in ``Regression/Checksum/``, and the benchmarks are stored as human-readable `JSON <https://www.json.org/json-en.html>`__ files in ``Regression/Checksum/benchmarks_json/``, with one file per benchmark (for instance, test ``Langmuir_2d`` has a corresponding benchmark ``Regression/Checksum/benchmarks_json/Langmuir_2d.json``).
The checksum module is located in ``Regression/Checksum/``, and the benchmarks are stored as human-readable `JSON <https://www.json.org/json-en.html>`__ files in ``Regression/Checksum/benchmarks_json/``, with one file per benchmark (for example, the test ``test_2d_langmuir_multi`` has a corresponding benchmark ``Regression/Checksum/benchmarks_json/test_2d_langmuir_multi.json``).
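
Since the benchmarks are plain JSON, they can be inspected directly. Here is a minimal sketch (the path is the one mentioned above; the two-level grouping assumed here, e.g., a ``lev=0`` block for fields and one block per particle species, should be checked against the actual file):

.. code-block:: python

   import json

   # inspect a stored benchmark (path relative to the repository root)
   with open("Regression/Checksum/benchmarks_json/test_2d_langmuir_multi.json") as f:
       benchmark = json.load(f)

   # print every recorded checksum, one block at a time
   for block, values in benchmark.items():
       for name, value in values.items():
           print(block, name, value)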

For more details on the implementation, the Python files in ``Regression/Checksum/`` should be well documented.
For more details, please refer to the Python implementation in ``Regression/Checksum/``.

From a user point of view, you should only need to use ``checksumAPI.py``. It contains Python functions that can be imported and used from an analysis Python script. It can also be executed directly as a Python script. Here are recipes for the main tasks related to checksum regression tests in WarpX CI.
From a user point of view, you should only need to use ``checksumAPI.py``, which contains Python functions that can be imported into an analysis script and can also be executed directly as a Python script.

Include a checksum regression test in an analysis Python script
---------------------------------------------------------------
How to compare checksums in your analysis script
------------------------------------------------

This relies on the function ``evaluate_checksum``:

.. autofunction:: checksumAPI.evaluate_checksum

For an example, see
Here's an example:

.. literalinclude:: ../../../Examples/analysis_default_regression.py
.. literalinclude:: ../../../Examples/Tests/embedded_circle/analysis.py
   :language: python

This can also be included in an existing analysis script. Note that the plotfile must be ``<test name>_plt?????``, as is generated by the CI framework.
This can also be included as part of an existing analysis script.
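
As a rough sketch of what such an inclusion might look like (the relative path to the checksum module and the argument names are assumptions for this example; the ``autofunction`` entry above and the included analysis script are authoritative):

.. code-block:: python

   import os
   import sys

   # make the checksum module importable (the path depends on where the script runs from)
   sys.path.insert(1, "../../../../warpx/Regression/Checksum/")
   from checksumAPI import evaluate_checksum

   # compare the checksums of the test output against the stored benchmark
   evaluate_checksum(
       test_name=os.path.split(os.getcwd())[1],  # assumes the directory is named after the test
       output_file=sys.argv[1],                  # output file passed on the command line
   )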

Evaluate a checksum regression test from a bash terminal
--------------------------------------------------------
How to evaluate checksums from the command line
-----------------------------------------------

To do so, execute ``checksumAPI.py`` as a Python script, passing the plotfile that you want to evaluate as well as the test name (so that the script knows which benchmark to compare it to).

@@ -41,11 +42,8 @@ See additional options
* ``--rtol`` relative tolerance for the comparison
* ``--atol`` absolute tolerance for the comparison (``numpy.isclose()`` combines both tolerances, as illustrated below)
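
As an illustration of how the two tolerances combine, ``numpy.isclose(a, b, rtol, atol)`` passes when ``abs(a - b) <= atol + rtol * abs(b)`` (the values below are made up for this example and are not WarpX defaults):

.. code-block:: python

   import numpy as np

   benchmark = 3.751589134191326  # stored reference value
   candidate = 3.751589134192000  # value computed from the new output

   rtol, atol = 1e-9, 1e-40  # illustrative tolerances
   # passes because abs(candidate - benchmark) <= atol + rtol * abs(benchmark)
   print(np.isclose(candidate, benchmark, rtol=rtol, atol=atol))  # True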

Create/Reset a benchmark with new values that you know are correct
------------------------------------------------------------------

Create/Reset a benchmark from a plotfile generated locally
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
How to create or reset checksums with local benchmark values
------------------------------------------------------------

This is done by executing ``checksumAPI.py`` as a Python script.
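
A rough sketch of what this could look like through the Python API is given below; ``reset_benchmark`` and its arguments are assumptions for this example, so check ``checksumAPI.py`` for the exact function names, signatures, and the equivalent command line options:

.. code-block:: python

   from checksumAPI import reset_benchmark  # assumed helper in checksumAPI.py

   # overwrite the stored JSON benchmark with checksums computed from a local output
   reset_benchmark(
       test_name="test_2d_langmuir_multi",  # test whose benchmark is reset
       output_file="diags/diag1000080",     # hypothetical path to the local output
   )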

@@ -65,8 +63,8 @@ Since this will automatically change the JSON file stored on the repo, make a se
git add <test name>.json
git commit -m "reset benchmark for <test name> because ..." --author="Tools <[email protected]>"

Automated reset of a list of test benchmarks
--------------------------------------------
How to reset checksums for a list of tests with local benchmark values
----------------------------------------------------------------------

If you set the environment variable ``CHECKSUM_RESET=ON`` (e.g., via ``export CHECKSUM_RESET=ON``) before running tests that are compared against existing benchmarks, the test analysis will reset the benchmarks to the new values and skip the comparison.

@@ -80,8 +78,8 @@ With `CTest <https://cmake.org/cmake/help/latest/manual/ctest.1.html>`__ (coming
# ... check and commit changes ...


Reset a benchmark from the Azure pipeline output on Github
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
How to reset checksums for a list of tests with benchmark values from the Azure pipeline output
-----------------------------------------------------------------------------------------------

Alternatively, the benchmarks can be reset using the output of the Azure continuous integration (CI) tests on GitHub. The output can be accessed by following the steps below:

47 changes: 33 additions & 14 deletions Docs/source/developers/testing.rst
@@ -3,33 +3,43 @@
Testing the code
================

When adding a new feature, you want to make sure that (i) you did not break the existing code and (ii) your contribution gives correct results. While the code is tested regularly remotely (on the cloud when commits are pushed to an open PR, and every night on local clusters), it can also be useful to run tests on your custom input file. This section details how to use both automated and custom tests.
When proposing a code change, you want to make sure that

Continuous Integration in WarpX
-------------------------------
* the code change does not break the existing code;
* the code change gives correct results (numerics, physics, etc.).

Configuration
^^^^^^^^^^^^^
WarpX follows the continuous integration (CI) software development practice, where automated builds and tests are run after merging code changes into the main branch.

Our regression tests are run with `CTest <https://cmake.org/cmake/help/book/mastering-cmake/chapter/Testing%20With%20CMake%20and%20CTest.html#>`__, an executable that comes with CMake.

The test suite is ready to run once you have configured and built WarpX with CMake, following the instructions that you find in our :ref:`Users <install-cmake>` or :ref:`Developers <building-cmake>` sections.

A test that requires a build option that was not configured and built will be skipped automatically. For example, if you configure and build WarpX in 1D only, any test of dimensionality other than 1D, which would require WarpX to be configured and built in the corresponding dimensionality, will be skipped automatically.
While the code is tested regularly remotely (on the cloud when commits are pushed to an open PR, and every night on local clusters), it can also be useful to run tests on your custom input file.

How to run pre-commit tests locally
-----------------------------------

When proposing code changes to WarpX, we perform a couple of automated stylistic and correctness checks on the code change.
You can run those locally before you push to save some time; install them once like this:
First, when proposing a code change, we perform a couple of automated style and correctness checks.

If you install the ``pre-commit`` tool on your local machine via

.. code-block:: sh

   python -m pip install -U pre-commit
   pre-commit install

the style and correctness checks will run automatically on your local machine whenever you run ``git commit``, before the changes are pushed.

If you do not install the ``pre-commit`` tool on your local machine, these checks will run automatically as part of our CI workflows and a commit containing style and correctness changes might be added automatically to your branch.
In that case, you will need to pull that automated commit before pushing further changes.

See `pre-commit.com <https://pre-commit.com>`__ and our ``.pre-commit-config.yaml`` file in the repository for more details.

How to configure the automated tests
------------------------------------

Our regression tests are run with `CTest <https://cmake.org/cmake/help/book/mastering-cmake/chapter/Testing%20With%20CMake%20and%20CTest.html#>`__, an executable that comes with CMake.

The test suite is ready to run once you have configured and built WarpX with CMake, following the instructions that you find in our :ref:`Users <install-cmake>` or :ref:`Developers <building-cmake>` sections.

A test that requires a build option that was not configured and built will be skipped automatically. For example, if you configure and build WarpX in 1D only, any test of dimensionality other than 1D, which would require WarpX to be configured and built in the corresponding dimensionality, will be skipped automatically.

How to run automated tests locally
----------------------------------

@@ -107,7 +117,15 @@ If you modify the code base locally and want to assess the effects of your code
How to add automated tests
--------------------------

As mentioned above, the input files and scripts used by the automated tests can be found in the `Examples <https://github.com/ECP-WarpX/WarpX/tree/development/Examples>`__ directory, either under `Physics_applications <https://github.com/ECP-WarpX/WarpX/tree/development/Examples/Physics_applications>`__ or `Tests <https://github.com/ECP-WarpX/WarpX/tree/development/Examples/Tests>`__.
An automated test typically consists of the following components:

* input file or PICMI input script;
* analysis script;
* checksum file.

To learn more about how to use checksums in automated tests, please see the corresponding section :ref:`Using checksums <developers-checksum>`.

As mentioned above, the input files and scripts used by the automated tests can be found in the `Examples <https://github.com/ECP-WarpX/WarpX/tree/development/Examples>`__ directory, under either `Physics_applications <https://github.com/ECP-WarpX/WarpX/tree/development/Examples/Physics_applications>`__ or `Tests <https://github.com/ECP-WarpX/WarpX/tree/development/Examples/Tests>`__.

Each test directory must contain a file named ``CMakeLists.txt`` where all tests associated with the input files and scripts in that directory must be listed.

@@ -173,7 +191,8 @@ A new test can be added by adding a corresponding entry in ``CMakeLists.txt`` as

If you need a new Python package dependency for testing, please add it in `Regression/requirements.txt <https://github.com/ECP-WarpX/WarpX/blob/development/Regression/requirements.txt>`__.

Sometimes two or more tests share a large number of input parameters. The shared input parameters can be collected in a "base" input file that can be passed as a runtime parameter in the actual test input files through the parameter ``FILE``.
Sometimes two or more tests share a large number of input parameters.
The shared input parameters can be collected in a "base" input file that can be passed as a runtime parameter in the actual test input files through the parameter ``FILE``.

If the new test is added in a new directory that did not exist before, please add the name of that directory with the command ``add_subdirectory`` in `Physics_applications/CMakeLists.txt <https://github.com/ECP-WarpX/WarpX/tree/development/Examples/Physics_applications/CMakeLists.txt>`__ or `Tests/CMakeLists.txt <https://github.com/ECP-WarpX/WarpX/tree/development/Examples/Tests/CMakeLists.txt>`__, depending on where the new test directory is located.

4 changes: 2 additions & 2 deletions Docs/source/developers/workflows.rst
@@ -8,7 +8,7 @@ Workflows

profiling
testing
documentation
checksum
local_compile
run_clang_tidy_locally
local_compile
documentation
@@ -3,9 +3,9 @@
"Bx": 0.0,
"By": 5.726296856755232,
"Bz": 0.0,
"Ex": 3751589134191.326,
"Ex": 3.751589134191326,
"Ey": 0.0,
"Ez": 3751589134191.332,
"Ez": 3.751589134191332,
"jx": 1.0100623329922576e+16,
"jy": 0.0,
"jz": 1.0100623329922578e+16
83 changes: 36 additions & 47 deletions Tools/DevUtils/update_benchmarks_from_azure_output.py
@@ -1,4 +1,4 @@
# Copyright 2023 Neil Zaim
# Copyright 2023 Neil Zaim, Edoardo Zoni
#
# This file is part of WarpX.
#
@@ -9,56 +9,45 @@
import sys

"""
This Python script updates the Azure benchmarks automatically using a raw Azure output textfile
that is given as the first and only argument of the script.

In the Azure output, we read the lines contained between
"New file for Test_Name:"
and the next occurrence of
"'----------------'"
And use these lines to update the benchmarks
This Python script updates the Azure benchmarks automatically using a raw
Azure output text file that is passed as a command line argument to the script.
"""

azure_output_filename = sys.argv[1]
# read path to Azure output text file
azure_output = sys.argv[1]

pattern_test_name = "New file for (?P<testname>[\w\-]*)"
closing_string = "----------------"
benchmark_path = "../../Regression/Checksum/benchmarks_json/"
benchmark_suffix = ".json"
# string to identify failing tests that require a checksums reset
new_checksums = "New checksums"
failing_test = ""

first_line_read = False
current_test = ""
# path of all checksums benchmark files
benchmark_path = "../../Regression/Checksum/benchmarks_json/"

with open(azure_output_filename, "r") as f:
with open(azure_output, "r") as f:
    # find length of Azure prefix to be removed from each line,
    # first line of Azure output starts with "##[section]Starting:"
    first_line = f.readline()
    prefix_length = first_line.find("#")
    # loop over lines
    for line in f:
        if current_test == "":
            # Here we search lines that read, for example,
            # "New file for LaserAcceleration_BTD"
            # and we set current_test = "LaserAcceleration_BTD"
            match_test_name = re.search(pattern_test_name, line)
            if match_test_name:
                current_test = match_test_name.group("testname")
                new_file_string = ""

        # remove Azure prefix from line
        line = line[prefix_length:]
        if failing_test == "":
            # no failing test found yet
            if re.search(new_checksums, line):
                # failing test found, set failing test name
                failing_test = line[line.find("test_") : line.find(".json")]
                json_file_string = ""
        else:
            # We add each line to the new file string until we find the line containing
            # "----------------"
            # which indicates that we have read the new file entirely

            if closing_string not in line:
                if not first_line_read:
                    # Raw Azure output comes with a prefix at the beginning of each line that we do
                    # not need here. The first line that we will read is the prefix followed by the
                    # "{" character, so we determine how long the prefix is by finding the last
                    # occurrence of the "{" character in this line.
                    azure_indent = line.rfind("{")
                    first_line_read = True
                new_file_string += line[azure_indent:]

            else:
                # We have read the new file entirely. Dump it in the json file.
                new_file_json = json.loads(new_file_string)
                json_filepath = benchmark_path + current_test + benchmark_suffix
                with open(json_filepath, "w") as f_json:
                    json.dump(new_file_json, f_json, indent=2)
                current_test = ""
            # extract and dump new checksums of failing test
            json_file_string += line
            if line.startswith("}"):  # end of new checksums
                json_file = json.loads(json_file_string)
                json_filename = failing_test + ".json"
                json_filepath = benchmark_path + json_filename
                print(f"\nDumping new checksums file {json_filename}:")
                print(json_file_string)
                with open(json_filepath, "w") as json_f:
                    json.dump(json_file, json_f, indent=2)
                # reset to empty string to continue search of failing tests
                failing_test = ""
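
# Illustration (not part of the script logic): if the raw Azure output contains
# a hypothetical line such as
#     New checksums for test_2d_langmuir_multi.json
# the slice line[line.find("test_") : line.find(".json")] above evaluates to
# "test_2d_langmuir_multi", and the checksum lines that follow it are dumped to
# benchmarks_json/test_2d_langmuir_multi.json.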