This repository has been archived by the owner on Jan 12, 2024. It is now read-only.
Why do we need a new format?
With the current format we cannot be sure that we correctly parse and/or URL-encode user-defined queries. We also want to move away from magic string prefixes, e.g. MV2;Microsecond;metricSelector=..., that are used for encoding the final query. See #625 for more details.
Current Dynatrace API endpoints
These are the endpoints that are currently supported; others may be added in the future.
Metrics V2
/api/v2/metrics/query
query parameters:
metricSelector (mandatory, String)
entitySelector (optional, String)
query parameters that cannot be used:
resolution (will be set to Inf by dynatrace-service)
from (will be set internally from event data)
to (will be set internally from event data)
path parameters:
none
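The Metrics V2 rules above could translate into URL construction roughly as follows. This is a minimal Python sketch, not the dynatrace-service implementation: the function name, the base URL and the timestamp values are assumptions made for illustration. Only the parameter names and the fixed resolution of Inf are taken from the list above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

METRICS_QUERY_PATH = "/api/v2/metrics/query"


def build_metrics_query_url(base_url, metric_selector, start, end, entity_selector=None):
    """Build a Metrics V2 query URL from structured parameters (hypothetical helper)."""
    if not metric_selector:
        raise ValueError("metricSelector is mandatory")
    params = {
        "metricSelector": metric_selector,
        "resolution": "Inf",  # always set by the service, never by the user
        "from": start,        # taken from the event data, never from the user
        "to": end,
    }
    if entity_selector:
        params["entitySelector"] = entity_selector
    # urlencode percent-encodes quotes, colons, commas and parentheses
    return base_url.rstrip("/") + METRICS_QUERY_PATH + "?" + urlencode(params)


url = build_metrics_query_url(
    "https://example.invalid",  # placeholder tenant URL
    'builtin:service.requestCount.total:merge("dt.entity.service"):sum',
    start="1640995200000",
    end="1640998800000",
    entity_selector="type(SERVICE),tag(keptn_managed)",
)
# Decoding the query string again yields the original, unescaped values
decoded = parse_qs(urlsplit(url).query)
```

Because every parameter is a separate value, encoding happens once and in one place, which is the property the magic string format cannot guarantee.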
USQL V1
/api/v1/userSessionQueryLanguage/table
query parameters:
query (mandatory, String)
query parameters that cannot be used:
explain (will be set to false internally)
addDeepLinkFields (will be set to false internally)
pageOffset and pageSize (will not be used)
startTimestamp (will be set internally from event data)
endTimestamp (will be set internally from event data)
path parameters:
none
additional Dynatrace Service Parameters from magic strings:
USQL;TILE_TYPE;DIMENSION;<query>
TILE_TYPE (can have values of SINGLE_VALUE, PIE_CHART, COLUMN_CHART, TABLE, determines the position of the value in the row)
DIMENSION (name of the dimension, that will be matched on the header of the row)
This behaviour is necessary for dashboards. For SLI files, however, only queries that produce a single result should be allowed, which is not enforced at the moment!
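To make the current USQL magic string format concrete, here is a small Python sketch of how such a string could be split into its parts. The names UsqlQuery and parse_usql_magic_string are hypothetical, and the sketch does not claim to mirror the actual parser in dynatrace-service. It only follows the USQL;TILE_TYPE;DIMENSION;<query> layout and the tile types listed above.

```python
from typing import NamedTuple

TILE_TYPES = {"SINGLE_VALUE", "PIE_CHART", "COLUMN_CHART", "TABLE"}


class UsqlQuery(NamedTuple):
    tile_type: str
    dimension: str
    query: str


def parse_usql_magic_string(value):
    """Split 'USQL;TILE_TYPE;DIMENSION;<query>' into its parts (hypothetical helper)."""
    # Split at most three times so that semicolons inside the query survive
    parts = value.split(";", 3)
    if len(parts) != 4 or parts[0] != "USQL":
        raise ValueError("expected USQL;TILE_TYPE;DIMENSION;<query>")
    _, tile_type, dimension, query = parts
    if tile_type not in TILE_TYPES:
        raise ValueError(f"unknown tile type: {tile_type!r}")
    if not query:
        raise ValueError("query must not be empty")
    return UsqlQuery(tile_type, dimension, query)


parsed = parse_usql_magic_string(
    "USQL;COLUMN_CHART;iOS;SELECT osFamily, AVG(duration) FROM usersession GROUP BY osFamily"
)
```

The positional layout is easy to get wrong by hand, which is one of the reasons for moving to named properties.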
Problem V2
/api/v2/problems
query parameters:
problemSelector (mandatory, String)
entitySelector (optional, String)
query parameters that cannot be used:
sort (will be set internally, if at all)
fields (will be set internally, if at all)
from (will be set internally from event data)
to (will be set internally from event data)
nextPageKey and pageSize (will not be used)
path parameters:
none
SecurityProblem V2
/api/v2/securityProblems
query parameters:
securityProblemSelector (mandatory, String)
query parameters that cannot be used:
sort (will be set internally, if at all)
fields (will be set internally, if at all)
from (will be set internally from event data)
to (will be set internally from event data)
nextPageKey and pageSize (will not be used)
path parameters:
none
SLO
/api/v2/slo/{id}
query parameters:
none
query parameters that cannot be used:
from (will be set internally from event data)
to (will be set internally from event data)
timeFrame (will be set internally)
path parameters:
id (mandatory, String)
Approach 1: Using a structured object instead of a magic string
The current (1.0) Keptn Service Level Indicator specification defines spec_version as a string that is set to 1.0, and indicators as a map of SLI name to provider-specific query, both strings.
We propose a new format that allows more flexibility on the SLI provider side.
Because we rely on Dynatrace APIs for all queries, there is no need to make this more extensible than necessary (e.g. in the way Kubernetes defines CRDs).
Below you will see the JSON Schema for the SLI file. The units are copied from the Dynatrace Help pages; maybe not all of them should be included, or others should be added. Please also note the two units NotApplicable and Unspecified.
This means we have total flexibility for all queries, as they can be either a simple string or an object with at least one property. Each SLI provider can then specify the structure for these queries on its own.
The current format could easily be rewritten in the new format, although this alone would not add much value. See the example below (a single indicator, for brevity):
With the information from the Dynatrace APIs (from above) we can also build our JSON Schema specification for Dynatrace-specific SLI queries.
Now we could rewrite the above query with more structure that would be helpful for us, by adding properties that adhere to the Dynatrace Environment V2 API Metrics endpoint and our specification. See the example for the Metrics V2 API below:
sli.yaml (v2.0) with structure
---
apiVersion: '2.0'
tags:
  - global tag for all indicators
indicators:
  throughput:
    displayName: Throughput (total service request count)
    query:
      endpoint: api/v2/metrics/query
      metricSelector: "builtin:service.requestCount.total:merge(\"dt.entity.service\"):sum"
      entitySelector: "type(SERVICE),tag(keptn_managed),tag(keptn_service:$SERVICE)"
    unit: Count
    tags:
      - first tag for throughput
      - other tag for throughput
PROs:
we can stick to a single sli.yaml file for multiple SLI definitions because the query basically only changed from string to object.
we are very clear about what we support and what we do not support
we can validate the correctness of the given API endpoints and parameters
our specification only changes if Dynatrace API changes, or if we want to support more features
CONs:
for each change in Dynatrace API or if we want to support further API endpoints, we must adapt our single file specification accordingly
all definitions for a single service must be done in one single file, which can be a little cumbersome to read if you have lots of SLI definitions per service
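The validation mentioned in the PROs could look roughly like the following Python sketch. It is only an illustration under assumptions: the table layout, the function name validate_query and the error messages are made up here, and the SLO entry assumes that id is given as a property of an api/v2/slo query. The mandatory and optional parameters are taken from the endpoint list further up.

```python
# Supported endpoints with their mandatory and optional parameters
SUPPORTED_ENDPOINTS = {
    "api/v2/metrics/query": {"required": {"metricSelector"}, "optional": {"entitySelector"}},
    "api/v2/problems": {"required": {"problemSelector"}, "optional": {"entitySelector"}},
    "api/v2/securityProblems": {"required": {"securityProblemSelector"}, "optional": set()},
    "api/v1/userSessionQueryLanguage/table": {"required": {"query"}, "optional": set()},
    "api/v2/slo": {"required": {"id"}, "optional": set()},  # assumption: id as a property
}


def validate_query(query):
    """Return a list of problems found in a structured query object (empty if valid)."""
    endpoint = query.get("endpoint")
    spec = SUPPORTED_ENDPOINTS.get(endpoint)
    if spec is None:
        return [f"unsupported endpoint: {endpoint!r}"]
    params = set(query) - {"endpoint"}
    errors = []
    for name in sorted(spec["required"] - params):
        errors.append(f"missing mandatory parameter: {name}")
    # Anything that is neither mandatory nor optional is rejected, which
    # covers internally managed parameters such as resolution, from and to
    for name in sorted(params - spec["required"] - spec["optional"]):
        errors.append(f"parameter not allowed: {name}")
    return errors


ok = validate_query({
    "endpoint": "api/v2/metrics/query",
    "metricSelector": "builtin:service.requestCount.total",
    "entitySelector": "type(SERVICE)",
})
rejected = validate_query({
    "endpoint": "api/v2/metrics/query",
    "metricSelector": "builtin:service.requestCount.total",
    "resolution": "1h",
})
```

Here ok is an empty list, while rejected reports that resolution is not allowed.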
Approach 2: Use a Monaco like way of mapping directories to endpoints
Each subfolder of a dynatrace/sli folder will be mapped to a Dynatrace API endpoint. Each file inside a subfolder will be considered the definition of an SLI, with the file name being the indicator name.
We will only support a subset of all available API endpoints and for each endpoint we will have a JSON Schema specifying the available parameters.
api/v2/metrics/query,
api/v2/problems,
api/v2/securityProblems,
api/v2/slo/{id},
api/v1/userSessionQueryLanguage/table
So here are two examples for folders:
dynatrace/sli/api/v2/metrics/query and
dynatrace/sli/api/v2/slo/524ca177-849b-3e8c-8175-42b93fbc33c5 where 524ca177-849b-3e8c-8175-42b93fbc33c5 would be an existing SLO ID
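The folder-to-endpoint mapping can be sketched in a few lines of Python. The helper name endpoint_and_indicator and the file name availability.yaml are hypothetical; the sketch only assumes what is stated above, namely that the path below dynatrace/sli is the endpoint and the file name is the indicator name.

```python
from pathlib import PurePosixPath

SLI_ROOT = PurePosixPath("dynatrace/sli")


def endpoint_and_indicator(file_path):
    """Derive (endpoint, indicator name) from an SLI file path (hypothetical helper)."""
    path = PurePosixPath(file_path)
    try:
        relative = path.relative_to(SLI_ROOT)
    except ValueError as err:
        raise ValueError(f"{file_path!r} is not below {SLI_ROOT}") from err
    # A definition file must sit inside at least one endpoint subfolder
    if path.suffix not in {".yaml", ".yml"} or len(relative.parts) < 2:
        raise ValueError(f"{file_path!r} is not an SLI definition file")
    return str(relative.parent), path.stem


metrics = endpoint_and_indicator("dynatrace/sli/api/v2/metrics/query/throughput.yaml")
slo = endpoint_and_indicator(
    "dynatrace/sli/api/v2/slo/524ca177-849b-3e8c-8175-42b93fbc33c5/availability.yaml"
)
```

For the SLO case the path parameter is simply part of the folder name, so no separate id property is needed.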
Here are the definitions for the Keptn SLI file format and for each of the above endpoints and other sub schemas:
Dynatrace Environment V2 Security Problems API JSON Schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://github.com/keptn-contrib/dynatrace-service/spec/sli/api/v2/securityProblems",
  "title": "Dynatrace Environment V2 Security Problems API",
  "description": "This object defines the available parameters for the api/v2/securityProblems endpoint",
  "type": "object",
  "properties": {
    "securityProblemSelector": {
      "type": "string"
    }
  },
  "required": [ "securityProblemSelector" ],
  "additionalProperties": false
}
Dynatrace Environment V2 Service-Level Objectives API JSON Schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://github.com/keptn-contrib/dynatrace-service/spec/sli/api/v2/slo",
  "title": "Dynatrace Environment V2 Service-Level Objectives API",
  "description": "This object defines the available parameters for the api/v2/slo endpoint",
  "type": "object",
  "properties": {
    "id": {
      "type": "string"
    }
  },
  "required": [ "id" ],
  "additionalProperties": false
}
Dynatrace Environment V1 RUM - User Sessions Table API JSON Schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://github.com/keptn-contrib/dynatrace-service/spec/sli/api/v1/userSessionQueryLanguage/table",
  "title": "Dynatrace Environment V1 RUM - User Sessions Table API",
  "description": "This object defines the available parameters for the api/v1/userSessionQueryLanguage/table endpoint",
  "type": "object",
  "properties": {
    "query": {
      "type": "string"
    }
  },
  "required": [ "query" ],
  "additionalProperties": false
}
With this specification we can now create an example file, that would show the final result:
In a (project, stage or service) folder we would have the following sub folder: dynatrace/sli/api/v2/metrics/query and inside this folder there would be a single file (for brevity) called throughput.yaml. So for the Metrics V2 Query endpoint we would need to comply with this specification from above: https://github.com/keptn-contrib/dynatrace-service/spec/sli/api/v2/metrics/query.
throughput.yaml
---
apiVersion: '2.0'
displayName: Throughput (total service request count)
indicator:
  metricSelector: "builtin:service.requestCount.total:merge(\"dt.entity.service\"):sum"
  entitySelector: "type(SERVICE),tag(keptn_managed),tag(keptn_service:$SERVICE)"
unit: Count
tags:
  - first tag for throughput
  - other tag for throughput
PROs:
we can have multiple SLI definition files, one per definition, which could be a little more readable if there are lots of definitions for a single service
we are very clear about what we support and what we do not support
if we want to add something, then we only have to add a new Schema for the new endpoint in its own file and the rest would not change
our specification only changes if Dynatrace API changes, or if we want to support more features
CONs:
for each change in Dynatrace API or if we want to support further API endpoints, we must adapt our specifications accordingly
we cannot stop users from having incorrect folder structures - this cannot be validated.
Remarks:
global (project, stage or service) metadata (here tags) for all SLIs beneath a certain level would need to be handled and collected by the SLI provider, because Keptn would (and should) not know anything about the folder structures and the file name.
E.g. we could store a file called metadata.yml that adheres to the above specification in the root SLI folder dynatrace/sli.
Approach 3.*: Be totally transparent on API endpoints and query parameters
Common features for both approaches:
no restrictions on parameters, parameter types or endpoints
a transformation (extraction) function (e.g. JSONPath) must be supplied to extract the desired data (a single number) from the HTTP response
if transformation works, great, if not, we will fail with an appropriate error
Here is the transparent (API agnostic) specification for a single indicator:
Dynatrace Service SLI JSON Schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://github.com/keptn-contrib/dynatrace-service/spec/sli/v2.0",
  "title": "Dynatrace API parameters",
  "description": "This object defines the minimum set of parameters",
  "type": "object",
  "properties": {
    "json_path": {
      "type": "string"
    }
  },
  "required": [ "json_path" ]
}
Please consider this HTTP response from Dynatrace Environment V2 Metrics Query API, given some available data and the queries in the examples below:
Dynatrace Environment V2 API - metrics/query HTTP Response
Approach 3.1
same Keptn SLI definition structure as in Approach 1 → single file in a single folder
endpoint property will be used for querying - no matter what it would be - based on a base URL
given properties - if any - would be added as query parameters (no need for path parameters, as they must be added to the endpoint value) - no matter if they are allowed/defined
in the case of e.g. the SLO API, which uses path parameters, we would have an endpoint like api/v2/slo/524ca177-849b-3e8c-8175-42b93fbc33c5 instead of api/v2/slo/{id}, because we no longer check endpoints for correctness
With the specification from above and the one from Approach 1 you could have a sli.yaml file like this:
sli.yaml (v2.0) with structure
---
apiVersion: '2.0'
tags:
  - global tag for all indicators
indicators:
  response_time_p95:
    displayName: Service response time (percentile 95)
    query:
      endpoint: api/v2/metrics/query
      metricSelector: "builtin:service.response.time:merge(\"dt.entity.service\"):percentile(95)"
      entitySelector: "type(SERVICE),tag(keptn_managed),tag(keptn_service:$SERVICE)"
      json_path: "$.result[0].data[0].values[0]"
    unit: MicroSecond
    tags:
      - first tag for response time
      - other tag for response time
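The extraction step for a json_path such as the one above can be sketched in Python as follows. This is a deliberately tiny evaluator that only understands $, .name and [index]; it is not a full JSONPath implementation. The function name and the sample response are illustrative assumptions (the response is reduced to the fields the path touches, with a made-up value).

```python
import re

# One path step: either ".name" or "[index]"
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def extract_number(document, json_path):
    """Follow a simple '$.a[0].b' style path and return the number found there."""
    if not json_path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    pos = 1
    while pos < len(json_path):
        match = _STEP.match(json_path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos}")
        key, index = match.group(1), match.group(2)
        try:
            current = current[key] if key is not None else current[int(index)]
        except (KeyError, IndexError, TypeError) as err:
            raise ValueError(f"path {json_path!r} does not match the response") from err
        pos = match.end()
    # bool is a subclass of int in Python, so it is excluded explicitly
    if isinstance(current, bool) or not isinstance(current, (int, float)):
        raise ValueError(f"path {json_path!r} does not point to a number")
    return float(current)


# Illustrative response, reduced to what the path needs
response = {
    "resolution": "Inf",
    "result": [{"data": [{"timestamps": [1640998800000], "values": [8433.4]}]}],
}
value = extract_number(response, "$.result[0].data[0].values[0]")
```

If the path does not match or does not end at a number, the function raises an error instead of returning a default, in line with the "fail with an appropriate error" behaviour described above.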
Approach 3.2
same Keptn SLI definition structure as in Approach 2 → multiple files in multiple folders
folder structure would be matched 1:1 to an endpoint - no matter what it would be - based on a base URL
given properties - if any - would be added as query parameters (no need for path parameters, as they are already given by the folder structure) - no matter if they are allowed/defined
With the specification from above and the one from Approach 2 you could have a file called service_response_time_p95.yaml in a folder called dynatrace/sli/api/v2/metrics/query
service_response_time_p95.yaml
---
apiVersion: '2.0'
displayName: Service response time (percentile 95)
indicator:
  metricSelector: "builtin:service.response.time:merge(\"dt.entity.service\"):percentile(95)"
  entitySelector: "type(SERVICE),tag(keptn_managed),tag(keptn_service:$SERVICE)"
  json_path: "$.result[0].data[0].values[0]"
unit: MicroSecond
tags:
  - some tag
PROs Variant 1:
we can have a single SLI definition file (if we think that this option is better than having multiple files)
PROs Variant 2:
we can have multiple SLI definition files, one per definition, which could be a little more readable if there are lots of definitions for a single service
Common PROs:
we do not have to change anything if e.g. the Dynatrace API changes or further API endpoints are added; adapting to this is up to the user
we do not have to change anything, as all Dynatrace API features would be available out of the box
we support everything that can be mapped to a Dynatrace API and where the extraction function returns a number value, whether it makes sense or not
Common CONs:
we cannot stop users from having incorrect endpoints (3.1) or folder structures (3.2), since we do not validate anything
we cannot stop users from using wrong or undesirable parameters for APIs
In order to still use the from and to timestamps that are sent in the cloud event, we need to deviate from being totally transparent a little bit:
we maintain a list of available endpoints that would allow from and to timestamps
if the endpoint is in our list we would add from and to query parameters accordingly
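A possible shape for this allow-list, as a Python sketch: the table and the function name with_timeframe are assumptions. Going slightly beyond the text, the table also records the parameter names per endpoint, because the USQL endpoint uses startTimestamp and endTimestamp instead of from and to (see the endpoint list at the top).

```python
# Endpoints that receive the event's time frame, and the parameter names they use.
# A key ending in "/" matches every endpoint below it (e.g. api/v2/slo/<id>).
TIMEFRAME_PARAMS = {
    "api/v2/metrics/query": ("from", "to"),
    "api/v2/problems": ("from", "to"),
    "api/v2/securityProblems": ("from", "to"),
    "api/v2/slo/": ("from", "to"),
    "api/v1/userSessionQueryLanguage/table": ("startTimestamp", "endTimestamp"),
}


def with_timeframe(endpoint, params, start, end):
    """Return a copy of params, extended by the time frame if the endpoint is listed."""
    result = dict(params)
    for known, (start_name, end_name) in TIMEFRAME_PARAMS.items():
        if endpoint == known or (known.endswith("/") and endpoint.startswith(known)):
            result[start_name] = start
            result[end_name] = end
            break
    return result


slo_params = with_timeframe(
    "api/v2/slo/524ca177-849b-3e8c-8175-42b93fbc33c5", {}, "1640995200000", "1640998800000"
)
other_params = with_timeframe("api/v2/entities", {"entitySelector": "type(HOST)"}, "1", "2")
```

Endpoints that are not in the table are forwarded unchanged, which keeps the approach as transparent as possible.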
TL;DR
A short version of the above
All approaches will have breaking changes for Keptn and possibly all other Keptn integrations relying on keptn/go-utils (KeptnBase::GetSLIConfiguration) - including dynatrace-service.
Approach 1 and Approach 3.1 will still store all SLI definitions at a certain project, stage or service level in a single file.
Approach 2 and Approach 3.2 will store all SLI definitions at a certain project, stage or service level in individual files, one for each SLI, in specific folders mimicking Dynatrace API endpoints.
Approaches 1 and 2 will do validation on input data and are restricted to a set of available (supported) Dynatrace APIs.
Approaches 3.1 and 3.2 will not do any validation on input data and will just (more or less) transparently forward this input data to the resulting Dynatrace API endpoints.
Food for thought
Does Keptn need to know how SLIs are specified and stored for a certain provider?
Probably not, because it will help with decoupling SLO from SLI
The only thing that Keptn needs to know and specify is the payload for the sh.keptn.event.get-sli.finished event data. See the current specification.
Below you can see an adapted version of this specification that incorporates our requirements. Please note the added definitions for tags, the added property tags on GetSLIFinishedEventData, and the added properties tags, displayName and unit on SLIResult.
What would change for our approaches above if Keptn does not know anything about SLI specs?
Not much, possibly just the values for the id fields in some of the JSON Schemas. Also there is no need to define the property unit at SLI provider side then, because it is already defined in Keptn Spec, which is where it belongs.
So it will only reduce the schemas above but not add anything to it.
Can we go ahead with any new format or are there any blockers?
Currently we rely on keptn/go-utils for retrieving SLI definitions, but it would be easy to just implement that functionality on our own. We have done similar things for other files as well.
Can we mix Approaches 1 and 3.1, or Approaches 2 and 3.2?
It would be easy to adapt the specifications in Approaches 1 and 2 to check certain API endpoints but also allow a generic endpoint without any validation. So these approaches are not necessarily mutually exclusive.
Can we have multiple SLI files?
In order to separate concerns, it would be great for customers to have the possibility of maintaining multiple SLI files that focus on different domains or parts of the whole system. They could then partition one big SLI file into multiple smaller files based on their organizational setup:
domain
user group (PMs, DevOps, SREs, ...)
department
...
This requirement could only be met with Approach 1 (and 3.1) and will not work with Approach 2 (and 3.2)