Access to Kubernetes from CI

Problem to solve

As an Application Operator, I would like certain CI jobs to be able to access my Kubernetes cluster, connected via GitLab Agent. That way I don't have to open up my cluster to access it from CI.

Intended users

Allison (Application Ops)
Priyanka (Platform Engineer)

User experience goal

The user can allow certain CI jobs to access Kubernetes clusters connected via the GitLab Agent.

A single CI job can access multiple clusters, that is, multiple agents. This is often required in production, where the environment is composed of multiple clusters in different regions or availability zones.

Proposal

In the agent's configuration file, managed as code in the configuration project, the user specifies the projects and groups whose CI jobs can access this particular agent. CI jobs of the configuration project itself can access all agents configured via this project (TODO security review).

# .gitlab/agents/my-agent/config.yaml
ci_access:
  # This agent is accessible from CI jobs in these projects
  projects:
    - id: group1/group1-1/project1
      default_namespace: namespace-to-use-as-default
      environments:
        - staging
        - review/*
      access_as:
        agent: {}
        impersonate:
          username: name-of-identity-to-impersonate
          uid: 06f6ce97-e2c5-4ab8-7ba5-7654dd08d52b
          groups:
            - group1
            - group2
          extra:
            - key: key1
              val: ["val1", "val2"]
            - key: key2
              val: ["x"]
        ci_job: {}
        ci_user: {}
  # This agent is accessible from CI jobs in projects in these groups
  groups:
    - id: group2/group2-1
      default_namespace: ...
      environments: ...
      access_as: ...

When a CI job that has access to one or more agents runs, GitLab injects a kubectl-compatible configuration file (using a variable of type File) and sets the KUBECONFIG environment variable to its location on disk. The file contains one context per GitLab Agent that the CI job is allowed to access.

The ci_access.projects[].default_namespace specifies the namespace for the context used in the CI/CD tunnel. Omitting default_namespace does not set a namespace in the context.

ci_access.projects[].environments[] restricts agent usage to CI jobs that deploy to a matching environment. See Branch and Environment restrictions.

If the project where the CI job is running has the certificate-based integration configured, the generated configuration file contains contexts for both integrations. This allows users to use both integrations simultaneously, for example to migrate from one to the other.

A CI job can set context <context name> as the current one using kubectl config use-context <context name>. A context can also be specified explicitly in each kubectl invocation using kubectl --context=<context name> <command>.

After a context is selected, kubectl (or any other compatible program) can be used as if working with a cluster directly.

We might add another level of authorization from the group side, if requested by users. This is tracked by https://gitlab.com/gitlab-org/gitlab/-/issues/330591 and is initially out of scope for the CI tunnel.

Implementation

`kubectl` configuration file

Context name is constructed according to the following pattern: <configuration project full path>:<agent name>.

Example: groupX/subgroup1/project1:my-agent.
Server is set to https://kas.gitlabhost.tld:<port>. The config needs only one NamedCluster element, which all contexts refer to. Its Name should be set to gitlab.
Namespace is set to the value of projects[].default_namespace.
Token is set to the value of <token type>:<agent id>:<CI_JOB_TOKEN>, where:
- <token type> is the type of the token that is being provided. For CI integration it's the string ci. In the future we may have more types of tokens that gitlab-kas may accept.
- <agent id> is the id of the agent that can be accessed using this context. This value and the context's name are the only unique values across contexts.
- <CI_JOB_TOKEN> is the value of the CI_JOB_TOKEN variable.
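
The rules above can be sketched in Python as follows. This is only an illustration, not the GitLab implementation: the function name `build_kubeconfig`, the shape of the `agents` input, and the naming of the user entries (`agent:<agent id>`) are assumptions made for the example. The result is a plain dictionary, which can be serialized as JSON (kubectl reads kubeconfig files as YAML, and JSON is a subset of YAML).

```python
def build_kubeconfig(kas_url, job_token, agents):
    """Build a kubectl-compatible config with one context per allowed agent.

    `agents` is a list of dicts with hypothetical keys:
    config_project_path, name, id and (optionally) default_namespace.
    """
    cluster_name = "gitlab"  # the single NamedCluster that all contexts refer to
    config = {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{"name": cluster_name, "cluster": {"server": kas_url}}],
        "users": [],
        "contexts": [],
    }
    for agent in agents:
        # Context name: <configuration project full path>:<agent name>
        context_name = f"{agent['config_project_path']}:{agent['name']}"
        # Hypothetical naming for the user entry; the document does not specify it.
        user_name = f"agent:{agent['id']}"
        config["users"].append({
            "name": user_name,
            # Token: <token type>:<agent id>:<CI_JOB_TOKEN>, token type is "ci"
            "user": {"token": f"ci:{agent['id']}:{job_token}"},
        })
        context = {"cluster": cluster_name, "user": user_name}
        # Omitting default_namespace leaves the namespace unset in the context.
        if agent.get("default_namespace"):
            context["namespace"] = agent["default_namespace"]
        config["contexts"].append({"name": context_name, "context": context})
    return config
```

Only the token and the context name differ between contexts; the cluster entry is shared.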

Branch and Environment restrictions

When ci_access.projects[].environments[] is present in an agent's configuration, only CI jobs that deploy to a matching environment are allowed to use the agent.

One or more environment entries can be specified, where each contains either the environment name or a wildcard environment scope. If the CI job's environment does not match any entries:

The injected configuration does not have a context for the agent.
The allowed_agents API response does not include the agent.

Environments can also be specified at the group level with ci_access.groups[].environments[]. If a project is authorized multiple times (for example, at both the project and group level), only the most specific configuration is used and environment entries are not merged. See /api/v4/job/allowed_agents API for details on configuration specificity.
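
A minimal Python sketch of the matching rule, under stated assumptions: `*` in an entry is taken to match any run of characters, a job without an environment is taken to match no entry, and the function names are hypothetical. The actual matching semantics in GitLab may differ in details.

```python
import re


def scope_matches(scope, environment):
    """Return True if `environment` matches `scope`.

    Assumption: `*` matches any (possibly empty) run of characters and
    every other character is matched literally.
    """
    pattern = ".*".join(re.escape(part) for part in scope.split("*"))
    return re.fullmatch(pattern, environment) is not None


def agent_allowed_for_environment(environments, job_environment):
    """Decide whether a job may use the agent, given the environments[] entries."""
    if not environments:
        # No restriction configured for this project or group.
        return True
    if job_environment is None:
        # Assumption: a job that does not deploy to an environment matches nothing.
        return False
    return any(scope_matches(scope, job_environment) for scope in environments)
```

With the entries `staging` and `review/*` from the example configuration, a job deploying to `review/feature-1` would be allowed and a job deploying to `production` would not.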

Identifiers

All identifiers have one of the following structures:

gitlab:<identifier type>
gitlab:<identifier type>:<identifier type-specific information>. The identifier type-specific information may contain colons (:) to separate pieces of information.

Impersonation

User impersonation, when configured, supplies identifying information to the in-cluster access control mechanisms, such as RBAC and admission controllers, when a request is made. This allows Platform Engineers to precisely set up permissions based on groups and/or "extra".

The identity used to make the actual Kubernetes API request in a cluster is configured using the access_as config section. For any option other than agent to work, agentk's ServiceAccount needs to have the correct permissions. At most one key is allowed:

agent - make the requests using the agent's identity, that is, using the ServiceAccount credentials the agentk Pod is running under. This is the default behavior. This is the only mode in which the user can use the impersonation functionality from the client. In the other modes, requests with impersonation headers are rejected with 400 because they cannot be fulfilled: those headers are already in use by the impersonation mode and there is no way to perform "nested impersonation".
impersonate - make the requests using the identity specified in the configuration.
ci_job - impersonate the CI job. When the agent makes the request to the actual Kubernetes API, it sets the impersonation credentials in the following way:
- UserName is set to gitlab:ci_job:<job id>
  
  Example: gitlab:ci_job:1074499489.
- Groups is set to:
  - gitlab:ci_job to identify all requests coming from CI jobs.
  - The list of ids of groups the project is in.
  - The project id.
  - The slug of the environment this job belongs to.
  Example: for a CI job in group1/group1-1/project1 where:
  - Group group1 has id 23.
  - Group group1/group1-1 has id 25.
  - Project group1/group1-1/project1 has id 150.
  - Job running in a prod environment, which has the production environment tier.
  group list would be [gitlab:ci_job, gitlab:group:23, gitlab:group_env_tier:23:production, gitlab:group:25, gitlab:group_env_tier:25:production, gitlab:project:150, gitlab:project_env:150:prod, gitlab:project_env_tier:150:production].
- Extra carries extra information about the request:
  - agent.gitlab.com/id contains the agent id.
  - agent.gitlab.com/config_project_id contains the agent's configuration project id.
  - agent.gitlab.com/project_id contains the CI project id.
  - agent.gitlab.com/ci_pipeline_id contains the CI pipeline id.
  - agent.gitlab.com/ci_job_id contains the CI job id.
  - agent.gitlab.com/username contains the username of the user the CI job is running as.
  - agent.gitlab.com/environment_slug contains the slug of the environment. Only set if running in an environment.
  - agent.gitlab.com/environment_tier contains the deployment tier of the environment. Only set if running in an environment.
ci_user - impersonate the user this CI job is running as. Details depend on https://gitlab.com/gitlab-org/gitlab/-/issues/243740, tentatively:
- UserName is set to gitlab:user:<username>
  
  Example: gitlab:user:ash2k.
- Groups is set to:
  - gitlab:user to identify all requests coming from GitLab users.
  - The list of roles the user has in the project where the CI job is running.
  Example: for a Maintainer in project group1/group1-1/project1 with id 150 the list of groups would be [gitlab:user, gitlab:project_role:150:reporter, gitlab:project_role:150:developer, gitlab:project_role:150:maintainer]
- Extra - see above.
The full list of groups for a user can be huge, so the list of roles the user has in the project is used instead.

Group/project ids are used because:

group/project names can be sensitive information that should not be exposed.
group/project names can change over time, breaking permissions set in RBAC.
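
The ci_job and ci_user identities described above can be sketched in Python as follows. The function names and parameters are hypothetical; the identifier formats are taken from the examples in this section, and the Extra fields are left out for brevity.

```python
def ci_job_identity(job_id, group_ids, project_id, env_slug=None, env_tier=None):
    """Build UserName and Groups for the ci_job mode.

    `group_ids` lists the ids of the groups the project is in, outermost first.
    """
    groups = ["gitlab:ci_job"]  # identifies all requests coming from CI jobs
    for group_id in group_ids:
        groups.append(f"gitlab:group:{group_id}")
        if env_tier:
            groups.append(f"gitlab:group_env_tier:{group_id}:{env_tier}")
    groups.append(f"gitlab:project:{project_id}")
    if env_slug:
        groups.append(f"gitlab:project_env:{project_id}:{env_slug}")
    if env_tier:
        groups.append(f"gitlab:project_env_tier:{project_id}:{env_tier}")
    return {"username": f"gitlab:ci_job:{job_id}", "groups": groups}


def ci_user_identity(username, project_id, roles):
    """Build UserName and Groups for the ci_user mode from the user's project roles."""
    groups = ["gitlab:user"]  # identifies all requests coming from GitLab users
    groups += [f"gitlab:project_role:{project_id}:{role}" for role in roles]
    return {"username": f"gitlab:user:{username}", "groups": groups}
```

Calling `ci_job_identity(1074499489, [23, 25], 150, "prod", "production")` reproduces the group list from the ci_job example, and `ci_user_identity("ash2k", 150, ["reporter", "developer", "maintainer"])` reproduces the one from the ci_user example.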

Authentication

Requests to https://kas.gitlabhost.tld:<port> are authenticated using the CI_JOB_TOKEN that is passed in each request.

Authorization

There are two authorization steps, performed in the following order:

Coarse-grained authorization: the CI job, identified by the supplied CI_JOB_TOKEN, is checked to see if it is allowed to access a particular agent, identified by the supplied agent id. Note that any agent id can be supplied by manipulating the configuration file, but only the agent ids that are allowed to be accessed from that particular CI job are allowed to pass this authorization step.
Fine-grained authorization: performed by the in-cluster access control mechanisms, configured by the Platform Engineer. Information, described in the Impersonation section above, can be used to define what is allowed.

Default configuration

By default, the agent should also work without an agent configuration file. In that case, the following configuration should apply:

# .gitlab/agents/<agent name>/config.yaml
ci_access:
  projects:
    - id: "<agent's configuration project id>"
      access_as:
        agent: {}

Notifying GitLab of agent's configuration

According to the proposal, the user maintains the list of groups and/or projects in the agent's configuration file. This can be thought of as agent id -> allowed project id and agent id -> allowed group id indexes. We need the reverse of these, that is, for a given project or group, the list of agents it is allowed to access. This is needed to:

Implement the /api/v4/job/allowed_agents API endpoint, providing the list of allowed agents with their configuration.
Construct the kubectl configuration file.

https://gitlab.com/gitlab-org/gitlab/-/issues/323708 tracks the plumbing work to make it possible to build such an index. Once it is implemented, we need to add new indexes to be able to perform:

ci project id -> agent id lookups: https://gitlab.com/gitlab-org/gitlab/-/issues/327411
group id -> agent id lookups: https://gitlab.com/gitlab-org/gitlab/-/issues/327851
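
To illustrate what "reverse index" means here, the following Python sketch inverts an in-memory mapping of agent id to ci_access section. It is a toy model: in GitLab these indexes would be database records, and the function name and input shape are assumptions made for the example.

```python
from collections import defaultdict


def build_reverse_indexes(agent_configs):
    """Invert agent id -> ci_access into project id -> agent ids and group id -> agent ids.

    `agent_configs` maps an agent id to a dict shaped like the ci_access
    section, with ids already resolved to numeric project and group ids.
    """
    project_to_agents = defaultdict(set)
    group_to_agents = defaultdict(set)
    for agent_id, ci_access in agent_configs.items():
        for project in ci_access.get("projects", []):
            project_to_agents[project["id"]].add(agent_id)
        for group in ci_access.get("groups", []):
            group_to_agents[group["id"]].add(agent_id)
    return dict(project_to_agents), dict(group_to_agents)
```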

`/api/v4/job/allowed_agents` API

/api/v4/job/allowed_agents is a new endpoint that returns the required data:

Information about the CI job, pipeline, project, user.
The list of agent ids that this CI job is allowed to access.

Only the needed fields are returned, not everything. Algorithm:

Retrieve the list of agents allowed to be accessed from the CI project by querying the ci project id -> agent id index.
Retrieve the list of agents configured in the CI project, if any. CI jobs can access these implicitly, with the default configuration. To let the user override that default by explicitly granting access to the configuration project, explicit grants take priority over the implicit configuration.
Gather an ordered (from more nested/inner to less nested/outer) list of groups for the CI project by querying the group id -> agent id index.

Example: for project group1/group1-1/project1 the list would be [group1/group1-1, group1].
For each group fetch the list of agents, allowed to be accessed by that group. If an agent id has already been seen either on step 1 or this step, discard the found information. Keep the most specific configuration for the agent.

Example: for project group1/group1-1/project1 the configuration specificity order is:
1. Project-level configuration group1/group1-1/project1.
2. Inner-most group configuration group1/group1-1.
3. Outer group configuration group1.
TBD What happens if user grants access to a group, containing the agent configuration project? Does it override the implicit configuration or not?
Collate information from above and return it.
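
The algorithm can be sketched in Python as follows, using plain dictionaries in place of the database indexes. All names are hypothetical. One point is an arbitrary choice of this sketch, since the TBD above leaves it open: the implicit configuration is kept when a group grant covers the same agent.

```python
def resolve_allowed_agents(project_id, group_ids_inner_to_outer,
                           project_index, implicit_agent_ids, group_index):
    """Return {agent id: configuration}, keeping the most specific configuration.

    project_index: {project id: {agent id: configuration}} (explicit grants)
    implicit_agent_ids: ids of agents configured in the CI project itself
    group_index: {group id: {agent id: configuration}}
    """
    resolved = {}
    # Step 1: explicit project-level grants are the most specific.
    for agent_id, configuration in project_index.get(project_id, {}).items():
        resolved[agent_id] = configuration
    # Step 2: agents configured in the CI project get the default configuration,
    # unless an explicit project-level grant already exists.
    for agent_id in implicit_agent_ids:
        resolved.setdefault(agent_id, {"access_as": {"agent": {}}})
    # Steps 3 and 4: walk groups from inner to outer and skip agents already seen.
    for group_id in group_ids_inner_to_outer:
        for agent_id, configuration in group_index.get(group_id, {}).items():
            resolved.setdefault(agent_id, configuration)
    return resolved
```

Because `setdefault` never overwrites an existing entry, the first configuration found for an agent, which is the most specific one, is the one that is returned.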

Request:

GET /api/v4/job/allowed_agents
Accept: application/json
Job-Token: <CI_JOB_TOKEN>

Job-Token header name is consistent with other API endpoints that use CI_JOB_TOKEN for authentication.

Response on success:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "allowed_agents": [
    {
      "id": 5, // agent id
      "config_project": {
        "id": 3
      },
      "configuration": { // contains section of the agent's config file as is, with 'id' removed
        "default_namespace": "namespace-to-use-as-default",
        "access_as": {
          "agent": {}
        }
      }
    },
    {
      "id": 3,
      "config_project": {
        "id": 3 // same as above
      },
      "configuration": {
        // "default_namespace": "", // not set
        "access_as": {
          "ci_job": {}
        }
      }
    },
    {
      "id": 10,
      "config_project": {
        "id": 11 // agent from a different project
      },
      "configuration": {
        "access_as": {
          "ci_user": {}
        }
      }
    }
  ],
  "job": {
    "id": 3 // job id
  },
  "pipeline": {
    "id": 6 // pipeline id
  },
  "project": {
    "id": 150, // project id
    "groups": [
      {
        "id": 23 // id of the group this project is in
      },
      {
        "id": 25
      }
    ]
  },
  "environment": {
    "slug": "slug_of_the_environment", // empty if not part of an environment
    "tier": "deployment_tier_of_the_environment" // empty if not part of an environment
  },
  "user": { // user who is running the job
    "id": 1,
    "username": "root",
    "roles_in_project": [
      "reporter", "developer", "maintainer"
    ]
  }
}

`/api/v4/internal/kubernetes/agent_configuration` API

/api/v4/internal/kubernetes/agent_configuration is a new endpoint that accepts configuration for an agent and updates necessary records in DB. It is invoked by kas each time it fetches an updated agent configuration.

If there is an error invoking the endpoint, kas still proceeds with returning the configuration to the agent to avoid impacting the user if there is an internal communication issue.

kas might send the same configuration more than once because it sends it on each new commit, even if there are no changes. This is consistent with kas sending configuration and GitOps manifests on each commit. We may optimize all three later or handle this on the Rails side to avoid doing duplicate work and causing unnecessary DB load. One option is to cache agent id -> configuration hash in Redis and compare the new/cached hashes before making any queries to the DB. This is not in scope of this document.

Sending "duplicate" configuration has certain benefits:

Simpler to implement.
If for any reason DB is not in sync (e.g. network errors), it will be updated eventually (on next commit).
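
Although the optimization is out of scope here, the hash comparison mentioned above could look roughly like this Python sketch. An in-process dictionary stands in for Redis, and the class and method names are hypothetical.

```python
import hashlib
import json


class ConfigDeduplicator:
    """Remember the hash of the last configuration seen per agent."""

    def __init__(self):
        self._hashes = {}  # stand-in for a Redis agent id -> configuration hash cache

    def should_update(self, agent_id, agent_config):
        """Return True if the configuration differs from the cached one."""
        # sort_keys makes the serialization, and so the hash, independent of key order.
        serialized = json.dumps(agent_config, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(serialized).hexdigest()
        if self._hashes.get(agent_id) == digest:
            return False  # same configuration as last time, skip the DB work
        self._hashes[agent_id] = digest
        return True
```

A cache like this trades away the second benefit listed above: if the DB gets out of sync while the cached hash stays the same, resending the configuration no longer repairs it.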

Request:

POST /api/v4/internal/kubernetes/agent_configuration
Content-Type: application/json
Gitlab-Kas-Api-Request: JWT token

{
  "agent_id": 5,
  "agent_config": {} // ConfigurationFile in pkg/agentcfg/agentcfg.proto
}

Response on success:

HTTP/1.1 204 No Content

Errors:

If JWT token is invalid or missing, a corresponding HTTP status code is returned (401/403).
If agent id does not exist, HTTP status code 400 is returned.

Request proxying flow

gitlab-kas gets a request from the CI job with CI_JOB_TOKEN and agent id in it.
- If CI_JOB_TOKEN is missing, the request is rejected with HTTP code 401.
- If agent id is missing or invalid, the request is rejected with HTTP code 400.
gitlab-kas makes a request to /api/v4/job/allowed_agents endpoint to get the information about the CI_JOB_TOKEN it received.
- It handles the HTTP status codes, returning 401/403 on 401/403 i.e. when CI_JOB_TOKEN is invalid.
gitlab-kas checks if the agent id it got in the request from the CI job is in the list it got from /api/v4/job/allowed_agents. If it is not, the request is rejected with HTTP code 403.
(optional) gitlab-kas adds impersonation headers to the request based on the agent's configuration.
gitlab-kas proxies the request to the destination agent, identified by the agent id from the request. See gitlab-kas request routing for information on how the request routing works.
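
The authorization part of this flow can be sketched in Python as follows. gitlab-kas is not written in Python, so this is purely illustrative. The function name, the `fetch_allowed_agents` callback (which stands in for the call to /api/v4/job/allowed_agents and returns a status code and a parsed body), the 400 for a malformed token, and the 502 for unexpected upstream failures are all assumptions of the sketch.

```python
import re


def authorize_request(token, fetch_allowed_agents):
    """Return (HTTP status, agent entry or None) for a token of the form
    <token type>:<agent id>:<CI_JOB_TOKEN>.
    """
    if not token:
        return 401, None
    parts = token.split(":", 2)
    if len(parts) != 3 or parts[0] != "ci":
        return 400, None  # assumption: malformed or unknown token type is a bad request
    _, raw_agent_id, job_token = parts
    if not job_token:
        return 401, None  # CI_JOB_TOKEN is missing
    if re.fullmatch(r"[0-9]+", raw_agent_id) is None:
        return 400, None  # agent id is missing or invalid
    agent_id = int(raw_agent_id)

    status, body = fetch_allowed_agents(job_token)
    if status in (401, 403):
        return status, None  # CI_JOB_TOKEN is invalid
    if status != 200:
        return 502, None  # assumption: other upstream failures surface as 502

    for agent in body["allowed_agents"]:
        if agent["id"] == agent_id:
            # Authorized: the caller can add impersonation headers based on
            # agent["configuration"] and proxy the request to this agent.
            return 200, agent
    return 403, None  # the job is not allowed to access this agent
```

Checking the agent id against the returned list is what makes tampering with the configuration file harmless: any agent id can be supplied, but only allowed ones pass.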
