From 7ec20dfc4e6cb5956aa8c5ceea483401d291a209 Mon Sep 17 00:00:00 2001
From: Peter Burkholder
Date: Thu, 2 Nov 2023 16:28:09 -0400
Subject: [PATCH] Fix broken links and SLO update

---
 _docs/ops/security-ir.md                      | 3 ++-
 _docs/overview/customer-service-objectives.md | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/_docs/ops/security-ir.md b/_docs/ops/security-ir.md
index 8356e9cb5..b3b9717d6 100644
--- a/_docs/ops/security-ir.md
+++ b/_docs/ops/security-ir.md
@@ -30,7 +30,8 @@ At a high level, incident response follows this process:
 
 - Determine if the anomaly / service disruption qualifies as an incident. That is:
   - Is there evidence of compromise or attack?
-  - Has the system been unable to maintain our [service level objectives]({{ site.baseurl }}{% link _docs/overview/customer-service-objectives/ %})?
+  - Has the system been unable to maintain our [service level objectives]({{ site.baseurl }}{% link _docs/overview/customer-service-objectives.md %})?
+    - An availability issue impacting a single customer is likely _not_ an incident
   - Is an attack imminent or suspected (e.g. a Log4J type vulnerability)
     - Most reported vulnerabilities are _not_ incidents, and are handled by our SI-02 Flaw Remediation process
 - Outside cloud.gov: A TTS staff member (the *reporter*) notices and reports a cloud.gov-related incident using the [TTS incident response process](https://handbook.tts.gsa.gov/general-information-and-resources/tech-policies/security-incidents/) and then notifying the cloud.gov team in [`#cg-support`](https://gsa-tts.slack.com/archives/C09CR1Q9Z)
diff --git a/_docs/overview/customer-service-objectives.md b/_docs/overview/customer-service-objectives.md
index 18bfd135a..c9c80cbe6 100644
--- a/_docs/overview/customer-service-objectives.md
+++ b/_docs/overview/customer-service-objectives.md
@@ -26,6 +26,8 @@ Our program’s top **priorities** are:
 Our **performance goals**:
 
 - We publicly publish real time metrics regarding platform availability at https://cloudgov.statuspage.io.
+  - For incidents, we consider less than 90% availability over one hour as an incident (6 minute outage), or less than 97% over a 24-hour period (43 minute outage).
+  - Our site is "unavailable" if more than 10% of public users are experiencing issues.
 
 Our **support availability**:
 
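
Review note on the thresholds added to `customer-service-objectives.md`: the outage figures in parentheses follow directly from the percentages (10% of 60 minutes is 6 minutes; 3% of 1,440 minutes is 43.2 minutes, rounded to 43). The sketch below is only an illustration of that arithmetic and of the "more than 10% of public users" rule. The function names and the idea of feeding in downtime minutes are assumptions made for this note; nothing here reflects how cloud.gov actually measures availability.

```python
# Illustrative check of the SLO thresholds in this patch.
# All names below are hypothetical; the patch defines no code.

MINUTES_PER_HOUR = 60
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(threshold_pct: float, window_minutes: float) -> float:
    """Downtime that brings availability down to exactly threshold_pct."""
    return (100 - threshold_pct) * window_minutes / 100


def availability_pct(downtime_minutes: float, window_minutes: float) -> float:
    """Availability over a window, as a percentage."""
    return 100 * (window_minutes - downtime_minutes) / window_minutes


def is_incident(downtime_last_hour: float, downtime_last_day: float) -> bool:
    """True if either threshold from the patch is breached (strictly less than)."""
    hourly = availability_pct(downtime_last_hour, MINUTES_PER_HOUR)
    daily = availability_pct(downtime_last_day, MINUTES_PER_DAY)
    return hourly < 90 or daily < 97


def is_unavailable(affected_users: int, total_users: int) -> bool:
    """True if more than 10% of public users are experiencing issues."""
    # Integer comparison avoids floating-point rounding at the 10% boundary.
    return affected_users * 10 > total_users
```

Because the patch says "less than", an outage of exactly 6 minutes in an hour sits at 90% and does not cross the hourly threshold under this reading; 7 minutes does.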