Test Scenarios Description
The Test Suite is organized around a series of Test Scenarios chaining a set of Test Cases.
The Test Scenarios are designed as a sequence of basic Test Cases covering one or several functionalities of the Target Site in order to reproduce a typical user operation on the target site.
The Test Scenarios are divided into two groups:
- Local Test Scenarios are executed locally, measuring metrics of the various functions from the local Test Site towards the Target Site. They are run using the cdab-client command line tool.
- Remote Test Scenarios are executed remotely on virtual machines within the service providers' cloud infrastructure (when available), measuring metrics of the various functions directly within the Target Site. They are run using the cdab-remote-client command line tool.
Test Scenario 15 (TS15) covers several end-to-end scenarios which are independent of each other.
The following sections give a short overview of the individual test cases and how they can be configured and run.
This test performs multiple concurrent remote HTTP web requests to the front endpoint of the target site. It measures, among other metrics, the average and peak response times.
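As an illustration only, the following minimal Python sketch shows this kind of concurrent timing measurement; the endpoint URL and the number of parallel requests are placeholders, not the actual cdab-client implementation.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

TARGET_URL = "https://catalogue.example-dias.eu/"  # placeholder front endpoint
N_REQUESTS = 20                                    # illustrative load factor


def timed_request(url: str) -> float:
    """Return the response time of a single GET request, in milliseconds."""
    start = time.perf_counter()
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return (time.perf_counter() - start) * 1000.0


with ThreadPoolExecutor(max_workers=N_REQUESTS) as pool:
    times = list(pool.map(timed_request, [TARGET_URL] * N_REQUESTS))

print(f"average response time: {statistics.mean(times):.1f} ms")
print(f"peak response time:    {max(times):.1f} ms")
```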
This test (TC201) performs a simple filtered search (e.g. by mission or product type) and verifies whether the results match the specified search criteria. The test client sends multiple concurrent remote HTTP web requests to the front OpenSearch API of the target site, using the OpenSearch mechanism to query and retrieve the search results. Searches are limited to simple filters (no spatial or time filters) chosen randomly from the missions dictionary.
Among the obtained metrics are the average and peak response times, the number of results and the size of the responses.
This test (TC202) performs a more complex filtered search (e.g. by geometry, acquisition period or ingestion date) and verifies whether the results match the specified search criteria. The test client sends multiple concurrent remote HTTP web requests to the front catalogue search API (preferably an OpenSearch API) of the target site, using the search mechanism to query and retrieve the search results. N queries are prepared with all filters (spatial and time filters included), composed of random filters from the missions dictionary.
The obtained metrics are the same as in TC201.
This test performs a specific filtered search (e.g. by geometry, acquisition period or ingestion date) with many results pages and verifies whether the results match the specified search criteria. The test client sends multiple concurrent remote HTTP web requests to the front OpenSearch API of the target site, using the OpenSearch mechanism to query and retrieve the search results over many results pages. The search filters are fixed (a moving window in time).
The obtained metrics are the same as in TC201.
This test performs a simple filtered search (e.g. by mission or product type) querying only offline data and verifies whether the results match the specified search criteria. The test client sends multiple concurrent remote HTTP web requests to the front catalogue search API (preferably an OpenSearch API) of the target site, using the search mechanism to query and retrieve the search results. N queries are prepared with simple filters (no spatial or time filters) plus a specific filter to select offline data only. They are composed with random filters from the missions dictionary.
The obtained metrics are the same as in TC201.
This test is the remote version of TC201. Its results are derived from executing TC201 from a virtual machine on the target provider's cloud (if the provider offers processing infrastructure).
This test is the remote version of TC202. Its results are derived from executing TC202 from a virtual machine on the target provider's cloud (if the provider offers processing infrastructure).
This test (TC301) evaluates the download service of the target site for online data. The test client makes a single remote download request to retrieve a product file via a product URL.
Among the metrics the test obtains is the throughput of the downloaded data.
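A minimal sketch of such a throughput measurement, assuming a product URL that can be downloaded directly over HTTP with basic authentication (the URL and credentials are placeholders):

```python
import time

import requests

PRODUCT_URL = "https://download.example-dias.eu/products/S2A_MSIL2A_example.zip"  # placeholder
CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time

start = time.perf_counter()
downloaded_bytes = 0
with requests.get(PRODUCT_URL, auth=("user", "password"), stream=True, timeout=60) as response:
    response.raise_for_status()
    for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
        downloaded_bytes += len(chunk)
elapsed = time.perf_counter() - start

# Throughput of the downloaded data in megabytes per second
print(f"downloaded {downloaded_bytes / 1e6:.1f} MB in {elapsed:.1f} s "
      f"({downloaded_bytes / 1e6 / elapsed:.2f} MB/s)")
```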
This test (TC302) evaluates the download capacity of the target site for online data using its maximum concurrent download capacity. It is the same as TC301 with as many concurrent downloads as the configured maximum allows.
The obtained metrics are the same as in TC301.
This test evaluates the download capacity of the target site for downloading data in bulk. It is the same as TC301 with as many downloads as the systematic search (TC213) returned.
The obtained metrics are the same as in TC301.
This test evaluates the capacity of the target site for downloading offline data. The test client sends multiple concurrent remote HTTP web requests to retrieve one or several product files from a set of selected URLs that are pointing to offline data.
The obtained metrics are the same as in TC301 and also include the latency for the availability of offline products.
This test is the remote version of TC301. Its results are derived from executing TC301 from a virtual machine on the target provider's cloud (if the provider offers processing infrastructure).
This test is the remote version of TC302. Its results are derived from executing TC302 from a virtual machine on the target provider's cloud (if the provider offers processing infrastructure).
This test (TC411) measures the cloud services capacity of the target site for provisioning a single virtual machine. The test client sends a remote web request using the cloud services API of the target site to request a typical virtual machine. Once the machine is ready, the test client executes a command within a docker container to start TC211 and TC311.
The obtained metrics include the provisioning latency and the process duration as well as information about the virtual machine configuration and related costs.
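The exact calls depend on the provider's cloud services API. As an illustration only, a sketch of the provisioning latency measurement for an OpenStack-based provider using the openstacksdk library could look as follows; the cloud entry, image, flavour and network identifiers are placeholders and error handling is omitted.

```python
import time

import openstack  # openstacksdk; assumes credentials are configured in clouds.yaml

conn = openstack.connect(cloud="target-provider")  # placeholder cloud entry

start = time.perf_counter()
server = conn.compute.create_server(
    name="cdab-benchmark-vm",
    image_id="IMAGE_ID",                # placeholder image (a typical Linux base image)
    flavor_id="FLAVOR_ID",              # placeholder flavour chosen for the target site
    networks=[{"uuid": "NETWORK_ID"}],  # placeholder network
)
server = conn.compute.wait_for_server(server)  # blocks until the machine is up
provisioning_latency = time.perf_counter() - start
print(f"provisioning latency: {provisioning_latency:.1f} s")

# ... run the dockerised test (e.g. TC211 and TC311) on the machine ...

conn.compute.delete_server(server)  # clean up once the test is done
```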
This test measures the cloud services capacity of the target site for provisioning multiple virtual machines. The test client sends remote web requests using the cloud services API of the target site to request N typical virtual machines. Once a machine is ready, the test client executes a command within a docker container to start TC212 and TC312.
The obtained metrics are the same as in TC411.
This test measures the cloud services capacity of the target site for provisioning virtual machines with the capability of running data-transforming algorithms. The test client sends a remote web request using the cloud services API of the target site to request a typical virtual machine. Once the machine is ready, the test client executes a command within a docker container to download a test product and run an algorithm to produce one or more outputs from it.
The obtained metrics are the same as in TC411.
This test evaluates the cloud services capacity of the target site for provisioning virtual machines with the capability of running pre-defined data-transforming algorithms in typical EO applications. The test client sends a request using the cloud services API of the target site to request a typical virtual machine. Once the machine is ready, the test case stages in its input data, executes the application matching the end-to-end scenario and compiles information about the success of the execution and related metrics.
Note: This test case is unlike the others because it runs different tests based on the end-to-end scenario use case for which it is called. They are all part of TS15, but cover very different applications and need to be considered separately and compared only within the same end-to-end scenario. The end-to-end scenario used is specified by its numeric ID, so for use case 1 (end-to-end scenario 1) the scenario ID would be TS15.1.
The obtained metrics are the same as in TC411.
This test case (TC501) evaluates the catalogue coverage of a target site by collection. The test client sends multiple concurrent catalogue requests to retrieve the total number of products for all possible combinations of filters in the configuration input. When a timeliness filter is applicable to a collection, the search excludes the time-critical items (e.g. NRT, STC).
The obtained metrics include information about the number of results and coverage percentage.
This test case (TC502) evaluates the local data coverage of a target site for all product type collections. The test client sends multiple concurrent catalogue requests to retrieve the total number of online and offline products for all possible product types.
The obtained metrics are the same as in TC501.
This test (TC503) evaluates the local data consistency of a target site by data offer collection. The test client sends multiple concurrent catalogue requests to retrieve the total number of online and offline products for all possible product types.
The obtained metrics are the same as in TC501.
This test (TC601) evaluates the data latency of a target site by collection. The test client sends multiple concurrent catalogue requests to retrieve the latest products per collection and compares their data publication time to the sensing time. A timeliness filter is applied to a collection, when applicable, to limit the search to the time-critical items (e.g. NRT, STC).
The obtained metrics include the average and maximum data operational latency and information about result quality.
This test (TC602) evaluates the data availability latency of a target site by collection with respect to a reference target site. The test client sends multiple concurrent catalogue requests to retrieve the latest products per collection and compares their data publication time to the publication time on the reference target site. When a timeliness filter is applicable to a collection, searches exclude the time-critical items (e.g. NRT, STC).
The obtained metrics include the average and maximum data availability latency and information about result quality.
This test (TC701) evaluates the upload performance of a target site's storage. The test client randomly generates a large file and uploads it to a newly created storage.
The obtained metrics include the data throughput, i.e. upload speed.
The test client downloads the file uploaded in TC701 from the storage, which is deleted upon completion.
The obtained metrics include the data throughput, i.e. download speed.
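For illustration, a sketch of the upload and download measurements against an S3-compatible object storage using boto3; the endpoint, credentials, bucket name and file size are placeholders and the actual storage API depends on the provider.

```python
import os
import time

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://object-store.example-dias.eu",  # placeholder endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

bucket, key = "cdab-storage-test", "upload-test.bin"  # placeholder names
payload = os.urandom(256 * 1024 * 1024)               # 256 MiB of random data

s3.create_bucket(Bucket=bucket)                       # newly created storage

start = time.perf_counter()
s3.put_object(Bucket=bucket, Key=key, Body=payload)   # upload (TC701)
upload_s = time.perf_counter() - start

start = time.perf_counter()
data = s3.get_object(Bucket=bucket, Key=key)["Body"].read()  # download of the uploaded file
download_s = time.perf_counter() - start

print(f"upload throughput:   {len(payload) / 1e6 / upload_s:.1f} MB/s")
print(f"download throughput: {len(data) / 1e6 / download_s:.1f} MB/s")

s3.delete_object(Bucket=bucket, Key=key)  # the storage is deleted upon completion
s3.delete_bucket(Bucket=bucket)
```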
In the Test Scenarios TS01 and TS02, the test cases TC201 and TC202 perform a set of typical queries on the catalogue API of the target sites. During the preparation phase of each test case, a function randomly generates those typical queries using the Mission Configuration Dictionary. It follows this pseudo-algorithm:
- randomly pick a mission defined in the Target Site Baseline Sets
- randomly select a collection from the previously selected sets
- randomly select some filters applicable to the collection in the Mission Configuration Dictionary
- for each selected filter, randomly generate a filter value according to the ranges or options defined in the dictionary (and in the World Borders Dataset for the complex queries)
- combine all filters into a single query
All steps are performed N times for the N queries to be run in the test cases according to the load factor (a minimal sketch of this generation process follows the examples below). Here are some examples of simple and complex randomly generated queries:
- "Mission Sentinel-1 Track [16 TO 169]",
- "Mission Sentinel-2 Count 86 Level-1C",
- "Mission Sentinel-1 Count 62 Stripmap",
- "Mission Sentinel-2 A Level-2A",
- "Mission Sentinel-3 B Near Real Time",
- "Mission Sentinel-2 A",
- "Mission Sentinel-3 Near Real Time Track [330 TO 355]",
- "Mission Sentinel-1 Count 47 Level-2 Ocean (OCN)",
- "Mission Sentinel-1 Count 73 B",
- "Mission Sentinel-3 Count 27 A"
- "Mission Sentinel-2 Level-2A From Sunday, October 22, 2017 To Thursday, June 13, 2019 intersecting United Republic of Tanzania",
- "Mission Sentinel-3 Count 18 From Sunday, October 22, 2017 intersecting Cape Verde",
- "Mission Sentinel-1 Count 13 From Sunday, January 22, 2017 To Tuesday, June 27, 2017 intersecting Cook Islands",
- "Mission Sentinel-3 A To Saturday, April 15, 2017 intersecting Jamaica"
For the spatial filtering, the benchmark client builds a WKT geometry filter from a configured shapefile (by default world administrative borders, see global configuration). The geometry is first reduced to a precision model of 0.1 degree and then simplified using the Douglas-Peucker algorithm down to ~10 km accuracy, preserving more of the initial topology. This method makes the geometry filter coarser than the original, but it remains representative of a user request.
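A minimal sketch of this geometry preparation using the shapely library (Shapely 2.x); the example polygon and the exact tolerance values are simplifications of what is described above.

```python
import shapely
from shapely import wkt

# Country geometry taken from the configured shapefile, as WKT (simplified placeholder)
original = wkt.loads("POLYGON ((29.34 -11.72, 40.44 -10.48, 39.19 -4.68, 29.34 -11.72))")

# 1. Reduce the coordinates to a 0.1 degree precision grid
coarse = shapely.set_precision(original, grid_size=0.1)

# 2. Douglas-Peucker simplification down to ~10 km (~0.1 degree), preserving topology
simplified = coarse.simplify(0.1, preserve_topology=True)

filter_wkt = simplified.wkt  # used as the spatial filter of the generated query
```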
“Count” means the number of items requested for a query. This is also randomized.
Each result returned during the main query phase of the test case is validated against every filter included in the search. For instance, each result of the search "Mission Sentinel-2 Level-2A From Sunday, October 22, 2017 To Thursday, June 13, 2019 intersecting United Republic of Tanzania"
is verified to be
- a Sentinel-2 product
- a Level-2A product type
- acquired between Sunday, October 22, 2017 and Thursday, June 13, 2019
- intersecting the border geometry of Tanzania
This randomization & validation process combined with a fairly significant load factor and the test scenarios frequency ensures
- to query each time a new search to the target site catalogue and thus benchmark properly the search performance (no cache).
- to benchmark properly the search functionality of each target site either on the filter capacity and on the results accuracy
- to ensure a fair distribution of the search tests among the target sites
Concerning the validation of the results' footprints against the geometry filter, the method also takes into account the data context of the benchmarked target sites. The catalogues contain very large collections of low-resolution products, which often leads the catalogue service to use a spatial index with coarser accuracy in order to favour query speed. The validation method therefore extends the validating geometry with a spatial buffer of 100 km, allowing close results to be validated.
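A sketch of this footprint check, again with shapely; the 1-degree buffer used here is a rough equator-level equivalent of 100 km and is an assumption of this sketch.

```python
from shapely import wkt


def footprint_is_valid(filter_wkt: str, footprint_wkt: str, buffer_deg: float = 1.0) -> bool:
    """Accept a result whose footprint intersects the filter geometry extended by ~100 km."""
    filter_geom = wkt.loads(filter_wkt)
    footprint = wkt.loads(footprint_wkt)
    return filter_geom.buffer(buffer_deg).intersects(footprint)
```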
Offline download strategy (TS04)
The offline download scenario aims at measuring the latency between the ordering of an offline product and its availability for direct download. To achieve that benchmark, a specific search for datasets marked as offline in the target site catalogue is performed first. Then a product order is placed using the API of the target site. The requested offline datasets are kept in a state file, and subsequent runs of the scenario try to retrieve the ordered datasets once they are ready. The time span between the dataset order and the actual file download is the metric used to benchmark the offline availability latency.
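A sketch of this stateful strategy, assuming a JSON state file and a placeholder `is_product_online()` check that would have to query the target site's API:

```python
import json
import pathlib
from datetime import datetime, timezone

STATE_FILE = pathlib.Path("offline-orders.json")  # hypothetical state file name


def is_product_online(product_id: str) -> bool:
    """Placeholder for the target-site-specific status check of an ordered product."""
    raise NotImplementedError


def load_state() -> dict:
    return json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}


def record_order(product_id: str) -> None:
    """Remember when an offline product was ordered."""
    state = load_state()
    state[product_id] = datetime.now(timezone.utc).isoformat()
    STATE_FILE.write_text(json.dumps(state, indent=2))


def check_orders() -> None:
    """On a later run, report the latency of ordered products that became available."""
    state = load_state()
    for product_id, ordered_at in list(state.items()):
        if is_product_online(product_id):
            latency = datetime.now(timezone.utc) - datetime.fromisoformat(ordered_at)
            print(f"{product_id}: available after {latency.total_seconds() / 3600:.1f} h")
            del state[product_id]  # the product can now be downloaded and dropped from the state
    STATE_FILE.write_text(json.dumps(state, indent=2))
```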
Data Coverage Benchmark Strategy (TS05)
In test scenario TS05, test case TC501 aims at retrieving metrics about the catalogue coverage of a target site. This test case uses the Common Baseline Sets referenced by the target site, which are combined to form collections representing the local data collections.
Every combination is called a Data Collection Division, and a search filter is generated accordingly, for instance "Sentinel-1 Level-1 Single Look Complex (SLC)".
At execution time, the test case TC501 is executed systematically on every Data Collection Division, requesting no result items but only the total number of results.
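A sketch of such a count-only request against a generic OpenSearch endpoint; the endpoint URL and the filter parameter names are placeholders, since real catalogue APIs differ.

```python
import xml.etree.ElementTree as ET

import requests

OS_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}


def total_results(endpoint: str, filters: dict) -> int:
    """Request zero result items and read only the total number of matches."""
    response = requests.get(endpoint, params={**filters, "count": 0}, timeout=60)
    response.raise_for_status()
    feed = ET.fromstring(response.content)
    return int(feed.findtext("os:totalResults", namespaces=OS_NS))


# Example Data Collection Division: Sentinel-1 Level-1 Single Look Complex (SLC)
n = total_results("https://catalogue.example-dias.eu/search/atom",  # placeholder endpoint
                  {"platform": "Sentinel-1", "productType": "SLC"})
```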
Test case TC502 performs the same kind of request but requests only the number of online data. Each Baseline Collection then reports the online/offline storage rate.
Test case TC503 performs a similar test as TC502 on collections defined in the Target Site Local Sets. This test case intends to measure how consistent the online data locally managed by the service provider is with respect to the "declared" data offer.
All TC50X test case results are reported in arrays of metrics sized to the number of Data Collection Divisions. It is not possible to validate every item of the query, so the total number of results returned by the search is assumed to be valid.
Data Latency Benchmark Strategy (TS06)
In test scenario TS06, test cases TC601 and TC602 aim at retrieving metrics about the data publication latency in the target site catalogues.
Test case TC601 uses the Copernicus Product Types defined in the Common Baseline Sets to generate a set of queries in order to retrieve the latest 100 results of the catalogue for every product type. This test case aims at representing the latency of availability of the most recent and critical products with regard to the sensing time. It is therefore necessary either to select the time-critical products or to discard the reprocessed data. To ensure this time-criticality filtering, the following strategy is applied:
- When a timeliness filter is available at the target site and allows selecting only time-critical products, the filter is activated. Setting the timeliness to time-critical products enables a better benchmarking of the operational data latency.
- If no timeliness filter is applicable, a sensing time filter is applied to a time span covering 4 times the maximum expected timeliness of the product type, as described in COPE-GSEG-EOPG-PD-14-0017.
- By default, a sensing time filter of 40 days is applied (see the sketch after this list).
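A small sketch of this sensing-time window computation; the maximum expected timeliness is assumed to come from the product-type configuration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def sensing_time_start(max_expected_timeliness_hours: Optional[float]) -> datetime:
    """Start of the sensing-time filter: 4x the expected timeliness, or 40 days by default."""
    if max_expected_timeliness_hours is not None:
        span = timedelta(hours=4 * max_expected_timeliness_hours)
    else:
        span = timedelta(days=40)
    return datetime.now(timezone.utc) - span
```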
As a result, the following queries are typical searches made at the APIHUB:
- "Sentinel-1 Level-0 SAR raw data (RAW) last 4 days",
- "Sentinel-1 Level-1 Ground Range Detected (GRD) last 4 days",
- "Sentinel-1 Level-2 Ocean (OCN) last 12 hours",
- "Sentinel-1 Level-1 Single Look Complex (SLC) last 4 days",
- "Sentinel-2 Level-1C last 12 hours",
- "Sentinel-2 Level-2A last 4 days",
- "Sentinel-3 OLCI Level-1B EO FR last 12 hours",
- "Sentinel-3 OLCI Level-1B EO RR last 12 hours",
- "Sentinel-3 OLCI Level-2 Land RR last 12 hours",
- "Sentinel-3 OLCI Level-2 Land FR last 12 hours",
- "Sentinel-3 SLSTR Level-1B RBT last 12 hours",
- "Sentinel-3 SLSTR Level-2 Land Surface Temp last 12 hours",
- "Sentinel-3 Altimetry Level-1 SRA Near Real Time",
- "Sentinel-3 Altimetry Level-1 SRA Short Time Critical",
- "Sentinel-3 Altimetry Level-1 SRA_A Short Time Critical",
- "Sentinel-3 Altimetry Level-1 SRA_BS Short Time Critical",
- "Sentinel-3 Altimetry Level-2 Land Near Real Time",
- "Sentinel-3 Altimetry Level-2 Land Short Time Critical",
- "Sentinel-3 Altimetry Level-2 Water Short Time Critical",
- "Sentinel-3 Synergy Level-2 Surface Reflectance Short Time Critical"
Test case TC602 uses the Copernicus Product Types defined in the Common Baseline Sets referenced by the target site to generate a set of queries in order to retrieve the latest 20 results of the catalogue for every product type. This test case aims at representing the latency of availability of the most recent and critical products with regard to the Reference Target Site (by default the APIHUB). It is desirable to discard the time-critical products in order to eliminate time-critical results that would be kept by a target site but not by the reference site.
As a result, the following typical queries are generated:
- "Sentinel-1 Level-0 SAR raw data (RAW)",
- "Sentinel-1 Level-1 Ground Range Detected (GRD)",
- "Sentinel-1 Level-2 Ocean (OCN)",
- "Sentinel-1 Level-1 Single Look Complex (SLC)",
- "Sentinel-2 Level-1C",
- "Sentinel-2 Level-2A",
- "Sentinel-3 OLCI Level-1B EO FR",
- "Sentinel-3 OLCI Level-1B EO RR",
- "Sentinel-3 OLCI Level-2 Land RR",
- "Sentinel-3 OLCI Level-2 Land FR",
- "Sentinel-3 SLSTR Level-1B RBT",
- "Sentinel-3 SLSTR Level-2 Land Surface Temp",
- "Sentinel-3 Altimetry Level-1 SRA Non Time Critical",
- "Sentinel-3 Altimetry Level-1 SRA_A Non Time Critical",
- "Sentinel-3 Altimetry Level-1 SRA_BS Non Time Critical",
- "Sentinel-3 Altimetry Level-2 Land Non Time Critical",
- "Sentinel-3 Altimetry Level-2 Water Non Time Critical",
- "Sentinel-3 Synergy Level-2 Surface Reflectance Non Time Critical"
Every result item is then queried on the reference target site and the two items' publication times are compared to extract the availability latency.
If the target site does not provide a method to get the publication date of products, the date when the query is performed is used instead as an approximate replacement.
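A sketch of the latency computation, assuming ISO-8601 publication timestamps retrieved from the two catalogues:

```python
from datetime import datetime


def availability_latency_seconds(target_published: str, reference_published: str) -> float:
    """Publication delay of the target site with respect to the reference site (e.g. the APIHUB)."""
    # Assumption: both timestamps are ISO-8601 strings, e.g. "2023-05-04T10:15:00Z"
    target = datetime.fromisoformat(target_published.replace("Z", "+00:00"))
    reference = datetime.fromisoformat(reference_published.replace("Z", "+00:00"))
    return (target - reference).total_seconds()
```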
Virtual Machines Provisioning Benchmark Strategy (TS1X)
In test scenarios TS1X, all test cases are executed on remotely hosted virtual machines provisioned on the infrastructure of the target site (which can be a DIAS or a third-party provider, such as Amazon AWS or Google Cloud Platform). The cdab-remote-client software, using the cloud services API, requests one or more virtual machines. Test cases TC41X monitor the provisioning sequence by measuring the time spans for the requested virtual machines to be up and ready for access. For each target site, a virtual machine image and flavour are chosen and the following execution steps are performed to prepare and execute the remote test. It is also possible to create several virtual machines based on different flavours at the same time and run tests in parallel.
- Install docker and start it;
- Install the CDAB docker image or another docker image for processing-related tests;
- Run the underlying test via cdab-client or the specific data-transforming algorithm;
- Retrieve the results from the underlying tests.
Once all steps are performed, the virtual machine is shut down. The result files are amended with the results of the test case TC41X.