Implement testing for Prefect workflow #17

amcnicho · 2024-12-10T01:46:09Z

Objective

Define, measure, and improve the reliability and fault tolerance of an example workflow based on Prefect.

Incorporate entities that facilitate timely and accurate failure detection.
An ideal rollback recovery approach would not require source code modifications, source code recompilation, or relinking support binaries.
Rollback recovery should include robust failure detection that activates without user intervention.
The time to create checkpoints should be significantly shorter than the application runtime, and the checkpoint size should be small.
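One way to make the checkpoint criterion measurable is to record how long a checkpoint takes to write and how large it is, then compare that time against the workflow runtime. The following is only a minimal sketch using the Python standard library: it assumes the workflow state can be pickled, and the names `write_checkpoint`, `read_checkpoint`, and `overhead_ratio` are hypothetical helpers, not part of Prefect.

```python
import pickle
import tempfile
import time
from pathlib import Path


def write_checkpoint(state, path):
    """Serialize `state` to `path`; return (elapsed_seconds, size_bytes)."""
    start = time.perf_counter()
    with open(path, "wb") as fh:
        pickle.dump(state, fh)
    elapsed = time.perf_counter() - start
    return elapsed, Path(path).stat().st_size


def read_checkpoint(path):
    """Load a previously written checkpoint (the rollback side)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def overhead_ratio(checkpoint_seconds, runtime_seconds):
    """Fraction of the runtime spent checkpointing (lower is better)."""
    if runtime_seconds <= 0:
        raise ValueError("runtime_seconds must be positive")
    return checkpoint_seconds / runtime_seconds


if __name__ == "__main__":
    # Hypothetical workflow state, for illustration only.
    state = {"step": 3, "partial_results": list(range(1000))}
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "checkpoint.pkl"
        elapsed, size = write_checkpoint(state, path)
        assert read_checkpoint(path) == state  # round trip is lossless
        print(f"checkpoint: {elapsed:.6f} s, {size} bytes")
        print(f"overhead vs. a 60 s run: {overhead_ratio(elapsed, 60.0):.2%}")
```

A test could then assert that the ratio stays below an agreed threshold; what counts as "significantly shorter" is not defined in this issue and would still need to be chosen.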

Note: This is essentially a sub-task of #8

The team has implemented tests that quantify the reliability and fault tolerance of the example Prefect workflow.
The team has simulated failures in the operation of the example Prefect workflow to demonstrate the usefulness of the tests.
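A simulated failure can be as simple as a step that raises on its first few calls, paired with tests that check whether the retry budget is enough to recover. The sketch below is plain Python and deliberately does not call Prefect; `InjectedFailure`, `FlakyStep`, and `run_with_retries` are hypothetical names standing in for a workflow step and for whatever retry mechanism the example workflow ends up using.

```python
class InjectedFailure(RuntimeError):
    """Raised by the simulated step to stand in for a real fault."""


class FlakyStep:
    """Callable that fails on its first `failures` calls, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise InjectedFailure(f"simulated fault on call {self.calls}")
        return "ok"


def run_with_retries(step, max_retries):
    """Run `step`, retrying up to `max_retries` times after InjectedFailure.

    Returns (result, attempts). Re-raises once the retries are exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return step(), attempts
        except InjectedFailure:
            if attempts > max_retries:
                raise


def test_recovers_within_retry_budget():
    # Two injected faults, two retries allowed: the third attempt succeeds.
    result, attempts = run_with_retries(FlakyStep(failures=2), max_retries=2)
    assert result == "ok"
    assert attempts == 3


def test_fails_when_budget_exhausted():
    # Three injected faults exceed the two allowed retries.
    try:
        run_with_retries(FlakyStep(failures=3), max_retries=2)
    except InjectedFailure:
        return
    raise AssertionError("expected InjectedFailure")
```

The two test functions follow the pytest naming convention, so they could run in CI as written. The same pattern should carry over to a real Prefect task once the example workflow is settled.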

Initial definitions of reliability and fault tolerance against which to implement tests for monitoring.
Test system and associated CI capabilities
Passing tests that function as the basis for a monitoring system of workflows based on Prefect.

There are established definitions and initial measurements that quantify the reliability and fault tolerance of workflows based on Prefect.
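As a starting point for discussion, the two quantities could be defined as simple ratios over recorded runs. The definitions below are assumptions for illustration, not the team's agreed definitions: reliability is taken as the share of all runs that succeed, fault tolerance as the share of fault-injected runs that still succeed, and `RunRecord` is a hypothetical record type.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RunRecord:
    succeeded: bool        # final state of the run
    fault_injected: bool   # whether a failure was simulated during the run
    recovery_seconds: Optional[float] = None  # time from fault to recovery


def reliability(runs: Sequence[RunRecord]) -> float:
    """Share of all runs that finished successfully (assumed definition)."""
    if not runs:
        raise ValueError("no runs recorded")
    return sum(r.succeeded for r in runs) / len(runs)


def fault_tolerance(runs: Sequence[RunRecord]) -> float:
    """Share of fault-injected runs that still succeeded (assumed definition)."""
    faulted = [r for r in runs if r.fault_injected]
    if not faulted:
        raise ValueError("no fault-injected runs recorded")
    return sum(r.succeeded for r in faulted) / len(faulted)


def mean_time_to_recovery(runs: Sequence[RunRecord]) -> Optional[float]:
    """Average recovery time over runs that recorded one, else None."""
    times = [r.recovery_seconds for r in runs if r.recovery_seconds is not None]
    return sum(times) / len(times) if times else None


if __name__ == "__main__":
    # Made-up records, purely to exercise the functions.
    runs = [
        RunRecord(succeeded=True, fault_injected=False),
        RunRecord(succeeded=True, fault_injected=True, recovery_seconds=2.0),
        RunRecord(succeeded=True, fault_injected=True, recovery_seconds=4.0),
        RunRecord(succeeded=False, fault_injected=True),
    ]
    print(reliability(runs))            # 0.75
    print(fault_tolerance(runs))        # 2 of 3 faulted runs recovered
    print(mean_time_to_recovery(runs))  # 3.0
```

Keeping the definitions as small pure functions means a test can pin them down before any monitoring system exists, and they can be revised once real incident data is available.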

Commonly used monitoring signals (latency, traffic, errors, saturation, time-to-recovery) might be difficult to quantify using workflows that only mock the behavior of domain applications, i.e., that call sleep functions instead of running actual workloads.
The appropriate measurement resolution is still undefined without knowing the details of integration with other services (such as user interfaces, resource pools, and user demand).
Without a well-understood model of the real incidents that might occur in a future working system, simulated failures might impose unrealistic constraints on the development of example workflows.

amcnicho added the Workflow label Dec 10, 2024

krlberry mentioned this issue Dec 10, 2024

Create a presentation for the Prefect workflow #14
