As an Application Operator, I would like certain CI jobs to be able to access my Kubernetes cluster, connected via GitLab Agent. That way I don't have to open up my cluster to access it from CI.
The user can allow certain CI jobs to access Kubernetes clusters connected via the GitLab Agent.
A single CI job can access multiple clusters, that is, multiple agents. This is often required in production environments, where the production environment is composed of multiple clusters in different regions/availability zones.
In the agent's configuration file, managed as code in the configuration project, the user specifies a list of projects and groups whose CI jobs can access this particular agent. CI jobs of the configuration project itself can access all agents configured via this project (TODO: security review).
```yaml
# .gitlab/agents/my-agent/config.yaml
ci_access:
  # This agent is accessible from CI jobs in these projects
  projects:
    - id: group1/group1-1/project1
      default_namespace: namespace-to-use-as-default
      environments:
        - staging
        - review/*
      access_as:
        agent: {}
        impersonate:
          username: name-of-identity-to-impersonate
          uid: 06f6ce97-e2c5-4ab8-7ba5-7654dd08d52b
          groups:
            - group1
            - group2
          extra:
            - key: key1
              val: ["val1", "val2"]
            - key: key2
              val: ["x"]
        ci_job: {}
        ci_user: {}
  # This agent is accessible from CI jobs in projects in these groups
  groups:
    - id: group2/group2-1
      default_namespace: ...
      environments: ...
      access_as: ...
```
When a CI job that has access to one or more agents runs, GitLab injects a `kubectl`-compatible configuration file (using a variable of type `File`) and sets the `KUBECONFIG` environment variable to its location on disk. The file contains a context per GitLab Agent that this CI job is allowed to access.
The `ci_access.projects[].default_namespace` setting specifies the namespace for the context used in the CI/CD tunnel. Omitting `default_namespace` does not set a namespace in the context.
`ci_access.projects[].environments[]` restricts agent usage to CI jobs that deploy to a matching environment. See Branch and Environment restrictions.
If the project where the CI job is running has the certificate-based integration configured, then the generated configuration file contains contexts for both integrations. This allows users to use both integrations simultaneously, for example to migrate from one to the other.
A CI job can set context `<context name>` as the current one using `kubectl config use-context <context name>`. A context can also be specified explicitly in each `kubectl` invocation using `kubectl --context=<context name> <command>`. After a context is selected, `kubectl` (or any other compatible program) can be used as if working with a cluster directly.
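A minimal `.gitlab-ci.yml` sketch of how a job could use the injected configuration. The job name, image, namespace, and context name are illustrative assumptions that follow the naming rules described below; any image that provides `kubectl` would work:

```yaml
# Hypothetical CI job; GitLab sets KUBECONFIG, so kubectl picks the file up automatically.
deploy:
  image:
    name: bitnami/kubectl:latest    # assumption: any image that ships kubectl
    entrypoint: [""]
  environment:
    name: staging                   # must match ci_access.projects[].environments, if set
  script:
    # Select the context generated for the agent this job is allowed to access.
    - kubectl config use-context groupX/subgroup1/project1:my-agent
    # From here on, kubectl behaves as if it talks to the cluster directly.
    - kubectl get pods --namespace namespace-to-use-as-default
```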
We might add another level of authorization from the group side, if requested by users. This is tracked by https://gitlab.com/gitlab-org/gitlab/-/issues/330591 and is initially out of scope for the CI tunnel.
- Context name is constructed according to the following pattern: `<configuration project full path>:<agent name>`. Example: `groupX/subgroup1/project1:my-agent`.
- Server is set to `https://kas.gitlabhost.tld:<port>`. There needs to be only one `NamedCluster` element in the config that all contexts refer to. Its `Name` should be set to `gitlab`.
- Namespace is set to the value of `projects[].default_namespace`.
- Token is set to the value of `<token type>:<agent id>:<CI_JOB_TOKEN>`, where:
  - `<token type>` is the type of the token that is being provided. For the CI integration it's the string `ci`. In the future we may have more types of tokens that `gitlab-kas` may accept.
  - `<agent id>` is the id of the agent that can be accessed using this context. This value and the context's name are the only unique values across contexts.
  - `<CI_JOB_TOKEN>` is the value of the `CI_JOB_TOKEN` variable.
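Putting these rules together, the generated file might look roughly like the sketch below. This is an illustration only, not the exact output; the port, the user entry names, and the overall field layout are assumptions:

```yaml
# Sketch of a generated kubeconfig (illustrative; exact output may differ).
apiVersion: v1
kind: Config
clusters:
  - name: gitlab                                 # the single NamedCluster all contexts refer to
    cluster:
      server: https://kas.gitlabhost.tld:443     # port shown here is an assumption
contexts:
  - name: groupX/subgroup1/project1:my-agent     # <configuration project full path>:<agent name>
    context:
      cluster: gitlab
      user: agent:5                              # user entry name is an assumption
      namespace: namespace-to-use-as-default     # from projects[].default_namespace
users:
  - name: agent:5
    user:
      token: ci:5:<CI_JOB_TOKEN>                 # <token type>:<agent id>:<CI_JOB_TOKEN>
```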
When `ci_access.projects[].environments[]` is present in an agent's configuration, only CI jobs that deploy to a matching environment are allowed to use the agent.
One or more environment entries can be specified, where each contains either the environment name or a wildcard environment scope. If the CI job's environment does not match any entries:

- The injected configuration does not have a context for the agent.
- The `allowed_agents` API response does not include the agent.
Environments can also be specified at the group level with `ci_access.groups[].environments[]`. If a project is authorized multiple times (for example, at both the project and group level), only the most specific configuration is used and environment entries are not merged. See the `/api/v4/job/allowed_agents` API for details on configuration specificity.
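A small sketch of how this resolution works, assuming an agent whose configuration authorizes both a project and one of its parent groups (the paths and environment names are illustrative):

```yaml
# .gitlab/agents/my-agent/config.yaml (illustrative)
ci_access:
  projects:
    - id: group1/group1-1/project1
      environments:
        - production
  groups:
    - id: group1
      environments:
        - staging
        - review/*
# For CI jobs in group1/group1-1/project1 the project-level entry is the most specific,
# so only "production" applies; the group-level environments are not merged in.
# CI jobs in other projects under group1 are restricted to "staging" and "review/*".
```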
All identifiers have one of the following structures:

- `gitlab:<identifier type>`
- `gitlab:<identifier type>:<identifier type-specific information>`

`identifier type-specific information` may contain colons (`:`) to separate pieces of information.
User impersonation, when configured, supplies identifying information to the in-cluster access control mechanisms, such as RBAC and admission controllers, when a request is made. This allows Platform Engineers to precisely set up permissions based on groups and/or "extra".
The identity that is used to make the actual Kubernetes API request in a cluster is configured using the `access_as` config section. For any option other than `agent` to work, `agentk`'s `ServiceAccount` needs to have the correct permissions. At most one key is allowed:
- `agent` - make the requests using the agent's identity, i.e. using the `ServiceAccount` credentials the `agentk` `Pod` is running under. This is the default behavior. This is the only impersonation mode where the user can use the impersonation functionality from the client. In other modes, requests with impersonation headers are rejected with 400 because they cannot be fulfilled - those headers are already in use by the impersonation mode and there is no way to perform "nested impersonation".
- `impersonate` - make the requests using the identity specified in the configuration.
- `ci_job` - impersonate the CI job. When the agent makes the request to the actual Kubernetes API, it sets the impersonation credentials in the following way:
  - `UserName` is set to `gitlab:ci_job:<job id>`. Example: `gitlab:ci_job:1074499489`.
  - `Groups` is set to:
    - `gitlab:ci_job` to identify all requests coming from CI jobs.
    - The list of ids of groups the project is in.
    - The project id.
    - The slug of the environment this job belongs to.

    Example: for a CI job in `group1/group1-1/project1` where:

    - Group `group1` has id `23`.
    - Group `group1/group1-1` has id `25`.
    - Project `group1/group1-1/project1` has id `150`.
    - The job runs in a `prod` environment, which has the `production` environment tier.

    the group list would be [`gitlab:ci_job`, `gitlab:group:23`, `gitlab:group_env_tier:23:production`, `gitlab:group:25`, `gitlab:group_env_tier:25:production`, `gitlab:project:150`, `gitlab:project_env:150:prod`, `gitlab:project_env_tier:150:production`].
  - `Extra` carries extra information about the request:
    - `agent.gitlab.com/id` contains the agent id.
    - `agent.gitlab.com/config_project_id` contains the agent's configuration project id.
    - `agent.gitlab.com/project_id` contains the CI project id.
    - `agent.gitlab.com/ci_pipeline_id` contains the CI pipeline id.
    - `agent.gitlab.com/ci_job_id` contains the CI job id.
    - `agent.gitlab.com/username` contains the username of the user the CI job is running as.
    - `agent.gitlab.com/environment_slug` contains the slug of the environment. Only set if running in an environment.
    - `agent.gitlab.com/environment_tier` contains the deployment tier of the environment. Only set if running in an environment.
- `ci_user` - impersonate the user the CI job is running as. Details depend on https://gitlab.com/gitlab-org/gitlab/-/issues/243740, tentatively:
  - `UserName` is set to `gitlab:user:<username>`. Example: `gitlab:user:ash2k`.
  - `Groups` is set to:
    - `gitlab:user` to identify all requests coming from GitLab users.
    - The list of roles the user has in the project where the CI job is running.

    Example: for a Maintainer in project `group1/group1-1/project1` with id `150`, the list of groups would be [`gitlab:user`, `gitlab:project_role:150:reporter`, `gitlab:project_role:150:developer`, `gitlab:project_role:150:maintainer`].
  - `Extra` - see above.

  The full list of groups for a user can be huge, so it was decided to use the list of roles the user has instead.

Group/project ids are used because:

- group/project names can be sensitive information that should not be exposed.
- group/project names can change over time, breaking permissions set in RBAC.
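To illustrate how a Platform Engineer might consume these identities in-cluster, here is a hedged RBAC sketch. The role name, namespace, resources, and verbs are assumptions; only the group names follow the scheme described above (`gitlab:project:<id>` for `access_as: ci_job`, `gitlab:project_role:<id>:<role>` for `access_as: ci_user`):

```yaml
# Illustrative RBAC; names, namespace, resources, and verbs are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ci-deployer
  namespace: namespace-to-use-as-default
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "services", "deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer-binding
  namespace: namespace-to-use-as-default
subjects:
  # With access_as: ci_job, every request from a CI job of project 150 carries this group.
  - kind: Group
    name: gitlab:project:150
    apiGroup: rbac.authorization.k8s.io
  # With access_as: ci_user, a Maintainer of project 150 carries this group instead.
  - kind: Group
    name: gitlab:project_role:150:maintainer
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: ci-deployer
  apiGroup: rbac.authorization.k8s.io
```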
Requests to `https://kas.gitlabhost.tld:<port>` are authenticated using the `CI_JOB_TOKEN` that is passed in each request.
There are two authorization steps, performed in the following order:
1. Coarse-grained authorization: the CI job, identified by the supplied `CI_JOB_TOKEN`, is checked to see if it is allowed to access the particular agent, identified by the supplied agent id. Note that any agent id can be supplied by manipulating the configuration file, but only requests for agent ids that the CI job is allowed to access pass this authorization step.
1. Fine-grained authorization: performed by the in-cluster access control mechanisms, configured by the Platform Engineer. The information described in the Impersonation section above can be used to define what is allowed.
By default, the agent should work without an agent configuration file as well. The following configuration should be the default:
```yaml
# .gitlab/agents/<agent name>/config.yaml
ci_access:
  projects:
    - id: "<agent's configuration project id>"
      access_as:
        agent: {}
```
According to the proposal, the user maintains the list of groups and/or projects in the agent's configuration file. This can be thought of as `agent id` -> `allowed project id` and `agent id` -> `allowed group id` indexes. We need the reverse of these, i.e. information about which agents a project/group is allowed to access. It is needed to:
- Implement the `/api/v4/job/allowed_agents` API endpoint, providing the list of allowed agents with their configuration.
- Construct the `kubectl` configuration file.
https://gitlab.com/gitlab-org/gitlab/-/issues/323708 tracks the plumbing work to make it possible to build such an index. Once it is implemented, we need to add new indexes to be able to perform:
- `ci project id` -> `agent id` lookups: https://gitlab.com/gitlab-org/gitlab/-/issues/327411
- `group id` -> `agent id` lookups: https://gitlab.com/gitlab-org/gitlab/-/issues/327851
`/api/v4/job/allowed_agents` is a new endpoint that returns the required data:
- Information about the CI job, pipeline, project, user.
- The list of agent ids that this CI job is allowed to access.
Only the needed fields are returned, not everything. Algorithm:
1. Retrieve the list of agents allowed to be accessed from the CI project by querying the `ci project id` -> `agent id` index.
1. Retrieve the list of agents configured in the CI project, if any. These are allowed to be accessed from CI jobs implicitly, with the default configuration. The user can set the configuration by explicitly granting access to the configuration project - to allow that, explicit grants are prioritized over the implicit configuration.
1. Gather an ordered (from more nested/inner to less nested/outer) list of groups for the CI project by querying the `group id` -> `agent id` index.

   Example: for project `group1/group1-1/project1` the list would be [`group1/group1-1`, `group1`].
1. For each group, fetch the list of agents allowed to be accessed by that group. If an agent id has already been seen either on step 1 or on this step, discard the found information. Keep the most specific configuration for the agent.

   Example: for project `group1/group1-1/project1` the configuration specificity order is:

   1. Project-level configuration `group1/group1-1/project1`.
   1. Inner-most group configuration `group1/group1-1`.
   1. Outer group configuration `group1`.
1. TBD: What happens if the user grants access to a group containing the agent configuration project? Does it override the implicit configuration or not?
1. Collate the information from above and return it.
Request:
```plaintext
GET /api/v4/job/allowed_agents
Accept: application/json
Job-Token: <CI_JOB_TOKEN>
```
The `Job-Token` header name is consistent with other API endpoints that use `CI_JOB_TOKEN` for authentication.
Response on success:
```plaintext
HTTP/1.1 200 OK
Content-Type: application/json

{
  "allowed_agents": [
    {
      "id": 5, // agent id
      "config_project": {
        "id": 3
      },
      "configuration": { // contains the section of the agent's config file as is, with 'id' removed
        "default_namespace": "namespace-to-use-as-default",
        "access_as": {
          "agent": {}
        }
      }
    },
    {
      "id": 3,
      "config_project": {
        "id": 3 // same as above
      },
      "configuration": {
        // "default_namespace": "", // not set
        "access_as": {
          "ci_job": {}
        }
      }
    },
    {
      "id": 10,
      "config_project": {
        "id": 11 // agent from a different project
      },
      "configuration": {
        "access_as": {
          "ci_user": {}
        }
      }
    }
  ],
  "job": {
    "id": 3 // job id
  },
  "pipeline": {
    "id": 6 // pipeline id
  },
  "project": {
    "id": 150, // project id
    "groups": [
      {
        "id": 23 // id of a group this project is in
      },
      {
        "id": 25
      }
    ]
  },
  "environment": {
    "slug": "slug_of_the_environment", // empty if not part of an environment
    "tier": "deployment_tier_of_the_environment" // empty if not part of an environment
  },
  "user": { // user who is running the job
    "id": 1,
    "username": "root",
    "roles_in_project": [
      "reporter", "developer", "maintainer"
    ]
  }
}
```
`/api/v4/internal/kubernetes/agent_configuration` is a new endpoint that accepts the configuration for an agent and updates the necessary records in the DB. It is invoked by `kas` each time it fetches an updated agent configuration.
If there is an error invoking the endpoint, `kas` still proceeds with returning the configuration to the agent to avoid impacting the user if there is an internal communication issue.
`kas` might send the same configuration more than once because it sends it on each new commit, even if there are no changes. This is consistent with `kas` sending configuration and GitOps manifests on each commit. We may optimize all three later or handle this on the Rails side to avoid doing duplicate work and causing unnecessary DB load. One option is to cache `agent id` -> `configuration hash` in Redis and compare the new/cached hashes before making any queries to the DB. This is not in scope of this document.
Sending "duplicate" configuration has certain benefits:
- Simpler to implement.
- If for any reason the DB is not in sync (e.g. network errors), it will be updated eventually (on the next commit).
Request:
```plaintext
POST /api/v4/internal/kubernetes/agent_configuration
Content-Type: application/json
Gitlab-Kas-Api-Request: JWT token

{
  "agent_id": 5,
  "agent_config": {} // ConfigurationFile in pkg/agentcfg/agentcfg.proto
}
```
Response on success:
```plaintext
HTTP/1.1 204 No Content
```
Errors:
- If the JWT token is invalid or missing, a corresponding HTTP status code is returned (401/403).
- If the agent id does not exist, HTTP status code 400 is returned.
1. `gitlab-kas` gets a request from the CI job with the `CI_JOB_TOKEN` and an agent id in it.
   - If `CI_JOB_TOKEN` is missing, the request is rejected with HTTP code 401.
   - If the agent id is missing or invalid, the request is rejected with HTTP code 400.
1. `gitlab-kas` makes a request to the `/api/v4/job/allowed_agents` endpoint to get information about the `CI_JOB_TOKEN` it received.
   - It handles the HTTP status codes, returning 401/403 on 401/403, i.e. when the `CI_JOB_TOKEN` is invalid.
1. `gitlab-kas` checks if the agent id it got in the request from the CI job is in the list it got from `/api/v4/job/allowed_agents`. If it is not, the request is rejected with HTTP code 403.
1. (optional) `gitlab-kas` adds impersonation headers to the request based on the agent's configuration.
1. `gitlab-kas` proxies the request to the destination agent, identified by the agent id from the request. See `gitlab-kas` request routing for information on how the request routing works.