If you are using a released version of Kubernetes, you should refer to the docs that go with that version.
The latest release of this document can be found [here](http://releases.k8s.io/release-1.1/docs/user-guide/production-pods.md). Documentation for other releases can be found at releases.k8s.io.
You’ve seen how to configure and deploy pods and containers, using some of the most common configuration parameters. This section dives into additional features that are especially useful for running applications in production.
The container file system only lives as long as the container does, so when a container crashes and restarts, changes to the filesystem will be lost and the container will restart from a clean slate. To access more-persistent storage, outside the container file system, you need a volume. This is especially important to stateful applications, such as key-value stores and databases.
For example, Redis is a key-value cache and store, which we use in the guestbook and other examples. We can add a volume to it to store persistent data as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: redis
spec:
  template:
    metadata:
      labels:
        app: redis
        tier: backend
    spec:
      # Provision a fresh volume for the pod
      volumes:
      - name: data
        emptyDir: {}
      containers:
      - name: redis
        image: kubernetes/redis:v1
        ports:
        - containerPort: 6379
        # Mount the volume into the pod
        volumeMounts:
        - mountPath: /redis-master-data
          name: data   # must match the name of the volume, above
`emptyDir` volumes live for the lifespan of the pod, which is longer than the lifespan of any one container, so if the container fails and is restarted, our storage will live on.
In addition to the local disk storage provided by `emptyDir`, Kubernetes supports many different network-attached storage solutions, including PD on GCE and EBS on EC2, which are preferred for critical data, and will handle details such as mounting and unmounting the devices on the nodes. See the volumes doc for more details.
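As a rough sketch of what a network-attached volume might look like, the fragment below replaces the `emptyDir` volume from the Redis example with a GCE persistent disk. The disk name `my-data-disk` is a hypothetical placeholder, and the fragment assumes the disk has already been created in the same zone as the node; the `volumeMounts` section of the container would stay unchanged because it refers to the volume only by name.

```yaml
# Sketch only: "my-data-disk" is a hypothetical, pre-existing GCE persistent disk.
volumes:
- name: data            # same volume name the container's volumeMounts refers to
  gcePersistentDisk:
    pdName: my-data-disk
    fsType: ext4
```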
Many applications need credentials, such as passwords, OAuth tokens, and TLS keys, to authenticate with other applications, databases, and services. Storing these credentials in container images or environment variables is less than ideal, since the credentials can then be copied by anyone with access to the image, pod/container specification, host file system, or host Docker daemon.
Kubernetes provides a mechanism, called secrets, that facilitates delivery of sensitive credentials to applications. A `Secret` is a simple resource containing a map of data. For instance, a simple secret with a username and password might look as follows:
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  password: dmFsdWUtMg0K
  username: dmFsdWUtMQ0K
As with other resources, this secret can be instantiated using `create` and can be viewed with `get`:
$ kubectl create -f ./secret.yaml
secrets/mysecret
$ kubectl get secrets
NAME TYPE DATA
default-token-v9pyz kubernetes.io/service-account-token 2
mysecret Opaque 2
To use the secret, you need to reference it in a pod or pod template. The `secret` volume source enables you to mount it as an in-memory directory into your containers.
apiVersion: v1
kind: ReplicationController
metadata:
  name: redis
spec:
  template:
    metadata:
      labels:
        app: redis
        tier: backend
    spec:
      volumes:
      - name: data
        emptyDir: {}
      - name: supersecret
        secret:
          secretName: mysecret
      containers:
      - name: redis
        image: kubernetes/redis:v1
        ports:
        - containerPort: 6379
        # Mount the volume into the pod
        volumeMounts:
        - mountPath: /redis-master-data
          name: data   # must match the name of the volume, above
        - mountPath: /var/run/secrets/super
          name: supersecret
For more details, see the secrets document, example and design doc.
Secrets can also be used to pass image registry credentials.
First, create a `.dockercfg` file, for example by running `docker login <registry.domain>`.
Then put the resulting `.dockercfg` file into a secret resource. For example:
$ docker login
Username: janedoe
Password: ●●●●●●●●●●●
Email: jdoe@example.com
WARNING: login credentials saved in /Users/jdoe/.dockercfg.
Login Succeeded
$ echo $(cat ~/.dockercfg)
{ "https://index.docker.io/v1/": { "auth": "ZmFrZXBhc3N3b3JkMTIK", "email": "jdoe@example.com" } }
$ cat ~/.dockercfg | base64
eyAiaHR0cHM6Ly9pbmRleC5kb2NrZXIuaW8vdjEvIjogeyAiYXV0aCI6ICJabUZyWlhCaGMzTjNiM0prTVRJSyIsICJlbWFpbCI6ICJqZG9lQGV4YW1wbGUuY29tIiB9IH0K
$ cat > /tmp/image-pull-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: myregistrykey
data:
  .dockercfg: eyAiaHR0cHM6Ly9pbmRleC5kb2NrZXIuaW8vdjEvIjogeyAiYXV0aCI6ICJabUZyWlhCaGMzTjNiM0prTVRJSyIsICJlbWFpbCI6ICJqZG9lQGV4YW1wbGUuY29tIiB9IH0K
type: kubernetes.io/dockercfg
EOF
$ kubectl create -f /tmp/image-pull-secret.yaml
secrets/myregistrykey
Now, you can create pods which reference that secret by adding an `imagePullSecrets` section to a pod definition.
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: janedoe/awesomeapp:v1
  imagePullSecrets:
  - name: myregistrykey
Pods support running multiple containers co-located together. They can be used to host vertically integrated application stacks, but their primary motivation is to support auxiliary helper programs that assist the primary application. Typical examples are data pullers, data pushers, and proxies.
Such containers typically need to communicate with one another, often through the file system. This can be achieved by mounting the same volume into both containers. An example of this pattern would be a web server with a program that polls a git repository for new updates:
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-nginx
spec:
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
      - name: www-data
        emptyDir: {}
      containers:
      - name: nginx
        image: nginx
        # This container reads from the www-data volume
        volumeMounts:
        - mountPath: /srv/www
          name: www-data
          readOnly: true
      - name: git-monitor
        image: myrepo/git-monitor
        env:
        - name: GIT_REPO
          value: http://github.com/some/repo.git
        # This container writes to the www-data volume
        volumeMounts:
        - mountPath: /data
          name: www-data
More examples can be found in our blog article and presentation slides.
Kubernetes’s scheduler will place applications only where they have adequate CPU and memory, but it can only do so if it knows how much of each resource they require. The consequence of specifying too little CPU is that the containers could be starved of CPU if too many other containers were scheduled onto the same node. Similarly, containers could die unpredictably due to running out of memory if no memory were requested, which can be especially likely for large-memory applications.
If no resource requirements are specified, a nominal amount of resources is assumed. (This default is applied via a LimitRange for the default Namespace. It can be viewed with `kubectl describe limitrange limits`.) You may explicitly specify the amount of resources required as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        resources:
          limits:
            # cpu units are cores
            cpu: 500m
            # memory units are bytes
            memory: 64Mi
          requests:
            # cpu units are cores
            cpu: 500m
            # memory units are bytes
            memory: 64Mi
The container will die due to OOM (out of memory) if it exceeds its specified memory limit, so specifying a value a little higher than expected generally improves reliability. By specifying a request, the pod is guaranteed to be able to use that much of the resource when needed. See Resource QoS for the difference between resource limits and requests.
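Requests and limits do not have to be equal. The fragment below is a minimal sketch of a container `resources` section that requests less than its limit, leaving some headroom above the expected usage; the specific numbers are illustrative assumptions, not recommendations.

```yaml
# Sketch only: values are illustrative, not tuned for any real workload.
resources:
  requests:
    cpu: 250m       # amount the scheduler reserves when placing the pod
    memory: 64Mi
  limits:
    cpu: 500m       # upper bound on CPU use
    memory: 128Mi   # exceeding this can get the container OOM-killed
```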
If you’re not sure what amounts of resources to request, you can first launch the application without specifying resources, and use resource usage monitoring to determine appropriate values.
Many applications running for long periods of time eventually transition to broken states, and cannot recover except by restarting them. Kubernetes provides liveness probes to detect and remedy such situations.
A common way to probe an application is using HTTP, which can be specified as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            # Path to probe; should be cheap, but representative of typical behavior
            path: /index.html
            port: 80
          initialDelaySeconds: 30
          timeoutSeconds: 1
Other times, applications are only temporarily unable to serve, and will recover on their own. Typically in such cases you’d prefer not to kill the application, but don’t want to send it requests, either, since the application won’t respond correctly or at all. A common such scenario is loading large data or configuration files during application startup. Kubernetes provides readiness probes to detect and mitigate such situations. Readiness probes are configured similarly to liveness probes, just using the `readinessProbe` field. A pod with containers reporting that they are not ready will not receive traffic through Kubernetes services.
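For illustration, a readiness probe for the nginx container used above might look like the following fragment. It is only a sketch: the probe path and the timing values are assumptions carried over from the liveness example, not settings taken from a real deployment.

```yaml
# Sketch only: same shape as the liveness probe, under the readinessProbe field.
containers:
- name: nginx
  image: nginx
  ports:
  - containerPort: 80
  readinessProbe:
    httpGet:
      path: /index.html
      port: 80
    initialDelaySeconds: 5   # illustrative value
    timeoutSeconds: 1
```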
For more details (e.g., how to specify command-based probes), see the example in the walkthrough, the standalone example, and the documentation.
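As a hedged sketch of a command-based probe, the fragment below runs a command inside the container and treats a zero exit status as healthy. The file `/tmp/health` is a hypothetical marker that the application is assumed to maintain while it is working; the timing values are illustrative.

```yaml
# Sketch only: /tmp/health is a hypothetical file the application keeps in place while healthy.
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/health
  initialDelaySeconds: 15
  timeoutSeconds: 1
```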
Of course, nodes and applications may fail at any time, but many applications benefit from clean shutdown, such as to complete in-flight requests, when the termination of the application is deliberate. To support such cases, Kubernetes supports two kinds of notifications:
- Kubernetes will send SIGTERM to applications, which can be handled in order to effect graceful termination. SIGKILL is sent a configurable number of seconds later if the application does not terminate sooner (defaults to 30 seconds, controlled by `spec.terminationGracePeriodSeconds`).
- Kubernetes supports the (optional) specification of a pre-stop lifecycle hook, which will execute prior to sending SIGTERM.
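To make the grace period concrete, here is a minimal sketch of a pod that extends it beyond the default. The pod name `slow-shutdown` and the value of 60 seconds are hypothetical choices for illustration.

```yaml
# Sketch only: name and grace period are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: slow-shutdown
spec:
  # Seconds to wait after SIGTERM before SIGKILL is sent
  terminationGracePeriodSeconds: 60
  containers:
  - name: nginx
    image: nginx
```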
The specification of a pre-stop hook is similar to that of probes, but without the timing-related parameters. For example:
apiVersion: v1
kind: ReplicationController
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              # SIGTERM triggers a quick exit; gracefully terminate instead
              command: ["/usr/sbin/nginx","-s","quit"]
In order to achieve a reasonably high level of availability, especially for actively developed applications, it’s important to debug failures quickly. Kubernetes can speed debugging by surfacing causes of fatal errors in a way that can be displayed using `kubectl` or the UI, in addition to general log collection. It is possible to specify a `terminationMessagePath` where a container will write its “death rattle”, such as assertion failure messages, stack traces, exceptions, and so on. The default path is `/dev/termination-log`.
Here is a toy example:
apiVersion: v1
kind: Pod
metadata:
  name: pod-w-message
spec:
  containers:
  - name: messager
    image: "ubuntu:14.04"
    command: ["/bin/sh","-c"]
    args: ["sleep 60 && /bin/echo Sleep expired > /dev/termination-log"]
The message is recorded along with the other state of the last (i.e., most recent) termination:
$ kubectl create -f ./pod.yaml
pods/pod-w-message
$ sleep 70
$ kubectl get pods/pod-w-message -o go-template="{{range .status.containerStatuses}}{{.lastState.terminated.message}}{{end}}"
Sleep expired
$ kubectl get pods/pod-w-message -o go-template="{{range .status.containerStatuses}}{{.lastState.terminated.exitCode}}{{end}}"
0
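If the application writes its final message somewhere other than the default location, the path can be set per container. The fragment below is a sketch of the same toy container writing to a custom path; `/tmp/my-termination-log` is a hypothetical location chosen only for illustration.

```yaml
# Sketch only: /tmp/my-termination-log is a hypothetical path.
containers:
- name: messager
  image: "ubuntu:14.04"
  terminationMessagePath: /tmp/my-termination-log
  command: ["/bin/sh","-c"]
  # Write the final message to the same path declared above
  args: ["sleep 60 && /bin/echo Sleep expired > /tmp/my-termination-log"]
```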