feat: Implementing certificate expiry detail in security dashboard #3000

Merged
merged 15 commits into from
Jun 28, 2024
23 changes: 23 additions & 0 deletions conf/rest/9.12.0/certificate.yaml
@@ -0,0 +1,23 @@

name: Certificate
query: api/security/certificates
object: certificate

counters:
  - ^^uuid
  - ^expiry_time => expiry_time
  - ^name
  - ^scope => scope
  - ^svm.name => svm
  - ^type => type
  - expiry_time(timestamp) => expiry_time

export_options:
  instance_keys:
    - expiry_time
    - name
    - uuid
  instance_labels:
Contributor:
If we need to write a Prometheus alert for expiry time, what should the query be?

Contributor Author:
Because this is a label, it is difficult to write an alert on it. Let me explore how to handle this.

Contributor Author:
I need to use a metric instead of a label for this.
This is the alert query for certificates expiring within 1 month:
0 < (certificate_expiry_time{} - time()) < (30*24*3600)

I will also add a sample warning alert for reference.

Contributor:
That looks good. Let's also add an alert for expired certificates.
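
To make the arithmetic behind the two expressions in this thread concrete, here is a minimal Python sketch. It is not Harvest or Prometheus code. The function name `classify` and the fixed `now` value are hypothetical, and the severities mirror the "warning" and "critical" labels used in the sample alert rules in this PR.

```python
from typing import Optional

THIRTY_DAYS = 30 * 24 * 3600  # same window as the alert expression, in seconds


def classify(expiry_epoch: float, now_epoch: float) -> Optional[str]:
    """Return the severity the two expressions would assign, or None if neither matches."""
    remaining = expiry_epoch - now_epoch  # certificate_expiry_time - time()
    if remaining < 0:
        return "critical"  # (certificate_expiry_time - time()) < 0
    if 0 < remaining < THIRTY_DAYS:
        return "warning"  # 0 < (certificate_expiry_time - time()) < 30 days
    return None  # 30 days or more away, or exactly zero seconds remaining


now = 1_700_000_000  # hypothetical "current" epoch, fixed so the example is reproducible
print(classify(now - 60, now))
print(classify(now + 7 * 24 * 3600, now))
print(classify(now + 90 * 24 * 3600, now))
```

Both comparisons are strict, so a certificate with exactly zero seconds remaining matches neither expression until the next evaluation.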

    - scope
    - svm
    - type
1 change: 1 addition & 0 deletions conf/rest/default.yaml
@@ -10,6 +10,7 @@ schedule:

objects:
  Aggregate: aggr.yaml
  Certificate: certificate.yaml
  # The CIFSSession template may slow down data collection due to a high number of metrics.
  # CIFSSession: cifs_session.yaml
  # CIFSShare: cifs_share.yaml
21 changes: 20 additions & 1 deletion container/prometheus/alert_rules.yml
@@ -101,4 +101,23 @@ groups:
      severity: "warning"
    annotations:
      summary: "{{ $labels.object }} [{{ $labels.volume }}] deleted"
      description: "{{ $labels.object }} [{{ $labels.volume }}] deleted"

  # Certificates expiring within 1 month
  - alert: Certificates expiring within 1 month
    expr: 0 < (certificate_expiry_time{} - time()) < (30*24*3600)
    for: 1m
    labels:
      severity: "warning"
    annotations:
      summary: "Certificate [{{ $labels.name }}] will be expiring on [{{ $labels.expiry_time }}]"
      description: "Certificate [{{ $labels.name }}] will be expiring on [{{ $labels.expiry_time }}]"

  # Certificates expired
  - alert: Certificates expired
    expr: (certificate_expiry_time{} - time()) < 0
    labels:
      severity: "critical"
    annotations:
      summary: "Certificate [{{ $labels.name }}] has been expired on [{{ $labels.expiry_time }}]"
      description: "Certificate [{{ $labels.name }}] has been expired on [{{ $labels.expiry_time }}]"
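
One way to try these expressions before loading the rules is to run them as instant queries against the Prometheus HTTP API (`/api/v1/query`). The sketch below only builds the request URLs and makes no network call. The base URL is an assumption (a local Prometheus on its default port), and `instant_query_url` is a hypothetical helper, not part of Harvest.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

PROMETHEUS_URL = "http://localhost:9090"  # assumption: local Prometheus, default port

# Expressions copied from the two alert rules above.
EXPIRING_SOON = "0 < (certificate_expiry_time{} - time()) < (30*24*3600)"
EXPIRED = "(certificate_expiry_time{} - time()) < 0"


def instant_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build an instant-query URL with the expression URL-encoded."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


for expr in (EXPIRING_SOON, EXPIRED):
    url = instant_query_url(expr)
    # Round-trip check: decoding the URL yields the original expression.
    assert parse_qs(urlsplit(url).query)["query"] == [expr]
    print(url)
```

Opening either URL in a browser or with an HTTP client should return the matching series, if any, as JSON.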
183 changes: 180 additions & 3 deletions grafana/dashboards/cmode/security.json
@@ -1900,14 +1900,191 @@
],
"type": "stat"
},
{
Contributor:
In this dashboard, root SVMs are excluded by default. Are we sure that root SVM certificates should also be excluded from this panel?

Contributor Author:
While checking the security/certificates REST call, I found no certificate records for root SVMs, so we are good to go here.

I will also add the scope field, which shows cluster or svm, so customers can see the scope of each certificate.

Collaborator:
Are you saying:
a) that root SVMs do not have certificates, or
b) that ONTAP does not return certificates for root SVMs?

I think you're saying that root SVMs have certificates but ONTAP is not returning them. If that's the case, we should check the expiry of root SVM certificates some other way.

Contributor Author:
A small correction to my note above:

  • Of the admin and node SVMs (which we treat as root SVMs), the admin SVM can have certificates.
  • Those certificates are cluster scoped, not SVM scoped (scope exists only in REST), so the REST response does not include the SVM name for them.
  • Getting that detail in REST takes a little more work, which the existing plugin already does. It carries this comment for reference:
    // Admin SVM certificate is cluster scoped, but the REST API does not return the SVM name in its response. Add here for ZAPI parity
  • Because the table contains both cluster-scoped and SVM-scoped certificates, I need to remove the svm filter from the query. So even when the SVM drop-down is limited to certain SVMs, this table shows certificates from all of them.
    Screenshot from .127 system

Note that the stats count above the table shows only the admin SVM's certificates (the admin SVM is unique in a cluster), not all of them.
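
The fill-in step described in the bullets above can be sketched roughly as follows. This is a hypothetical Python illustration, not the existing Harvest plugin. The record layout, the `fill_admin_svm` name, and the sample values are assumptions made for the example.

```python
from typing import Dict, List


def fill_admin_svm(certs: List[Dict[str, str]], admin_svm: str) -> List[Dict[str, str]]:
    """Return copies of the records with the admin SVM name added to cluster-scoped ones."""
    filled = []
    for cert in certs:
        row = dict(cert)  # copy so the caller's records are left untouched
        # Cluster-scoped certificates arrive without an SVM name; SVM-scoped ones keep theirs.
        if row.get("scope") == "cluster" and not row.get("svm"):
            row["svm"] = admin_svm
        filled.append(row)
    return filled


# Hypothetical records shaped like the labels exported by the template.
records = [
    {"name": "cluster-ca", "scope": "cluster", "svm": ""},
    {"name": "vs1-server", "scope": "svm", "svm": "vs1"},
]
for row in fill_admin_svm(records, admin_svm="cluster-admin"):
    print(row["name"], row["scope"], row["svm"])
```

With the SVM name filled in this way, a table that mixes cluster-scoped and SVM-scoped certificates has a value in every row of the SVM column.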

"datasource": "${DS_PROMETHEUS}",
"description": "This panel requires Harvest REST collector.\n\nThis panel displays Certificate expiration time.",
"fieldConfig": {
"defaults": {
"color": {
"fixedColor": "transparent",
"mode": "fixed"
},
"custom": {
"align": "left",
"cellOptions": {
"type": "auto"
},
"filterable": true,
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
}
]
},
"unitScale": true
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "Expires In"
},
"properties": [
{
"id": "unit",
"value": "dateTimeFromNow"
}
]
},
{
"matcher": {
"id": "byName",
"options": "cluster"
},
"properties": [
{
"id": "displayName",
"value": "Cluster"
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "",
"url": "/d/cdot-cluster/ontap-cluster?orgId=1&${Datacenter:queryparam}&${__url_time_range}&var-Cluster=${__value.raw}"
}
]
}
]
},
{
"matcher": {
"id": "byName",
"options": "svm"
},
"properties": [
{
"id": "displayName",
"value": "SVM"
},
{
"id": "links",
"value": [
{
"targetBlank": true,
"title": "",
"url": "/d/cdot-svm/ontap-svm?orgId=1&${Datacenter:queryparam}&${Cluster:queryparam}&${__url_time_range}&var-SVM=${__value.raw}"
}
]
}
]
}
]
},
"gridPos": {
"h": 7,
"w": 24,
"x": 0,
"y": 26
},
"id": 228,
"options": {
"cellHeight": "sm",
"footer": {
"countRows": false,
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"showHeader": true
},
"pluginVersion": "10.3.1",
"targets": [
{
"datasource": "${DS_PROMETHEUS}",
"editorMode": "code",
"exemplar": false,
"expr": "certificate_labels{datacenter=~\"$Datacenter\", cluster=~\"$Cluster\"}",
"format": "table",
"hide": false,
"instant": true,
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"title": "SSL Certificates Expiration",
"transformations": [
{
"id": "filterFieldsByName",
"options": {
"include": {
"names": [
"cluster",
"expiry_time",
"name",
"scope",
"svm",
"type"
]
}
}
},
{
"id": "calculateField",
"options": {
"alias": "Expiry Date",
"mode": "reduceRow",
"reduce": {
"include": [
"expiry_time"
],
"reducer": "last"
}
}
},
{
"id": "organize",
"options": {
"excludeByName": {},
"includeByName": {},
"indexByName": {
"Expiry Date": 5,
"cluster": 0,
"expiry_time": 6,
"name": 2,
"scope": 4,
"svm": 1,
"type": 3
},
"renameByName": {
"expiry_time": "Expires In",
"name": "Certificate Name",
"scope": "Scope",
"type": "Type"
}
}
}
],
"type": "table"
},
{
"collapsed": true,
"datasource": "${DS_PROMETHEUS}",
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
- "y": 26
+ "y": 33
},
"id": 12,
"panels": [
@@ -2253,7 +2430,7 @@
"h": 1,
"w": 24,
"x": 0,
- "y": 29
+ "y": 36
},
"id": 223,
"panels": [
@@ -3830,7 +4007,7 @@
"h": 1,
"w": 24,
"x": 0,
- "y": 30
+ "y": 37
},
"id": 227,
"panels": [