Pod & Container: Probes
In Kubernetes, a probe is a mechanism used to determine the health and readiness of a container running within a pod. Probes are defined in the pod specification and performed periodically to make sure that the containers inside a pod are running properly.
Probe Types
Kubernetes provides three types of probes to monitor and manage the health of our containers.
Liveness Probe
This probe checks whether a container is alive (still running properly). If it fails, Kubernetes restarts the container. For example, a liveness probe can catch a deadlock, where an application is running but unable to make progress. Restarting a container in such a state helps keep the application available despite bugs.
Readiness Probe
This probe checks whether a container is ready to accept traffic. If it fails, Kubernetes removes the pod from the Service's endpoints. For example, an application may need time to reload its configuration and become temporarily unavailable. In such cases, you don't want to kill the application, but you don't want to send it requests either.
Startup Probe
This probe checks whether a slow-starting container has fully started. It gives the container extra time to start before liveness and readiness checks begin. If it fails, Kubernetes restarts the container. For example, if an application takes 90 seconds to start, a startup probe prevents Kubernetes from deciding it is "dead" during that time.
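For the 90-second example above, a startup probe might be sketched like this (the `/health` path and port `8080` are placeholders for illustration, not part of any app yet):

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  failureThreshold: 20   # allow up to 20 attempts...
  periodSeconds: 5       # ...5 seconds apart, i.e. up to 100s to start
```

While the startup probe is running, Kubernetes holds off the liveness and readiness checks; they only begin once the startup probe has succeeded.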
Ways to Define Probe Status
For each probe, there are three ways to determine its status:
Command Execution (`exec`)

Runs a command inside the container.

```yaml
exec:
  command:
    - cat
    - /tmp/ready
initialDelaySeconds: 5
periodSeconds: 10
```

- Success: exit code `0`.
- Failure: any non-zero exit code.
HTTP Request (`httpGet`)

Sends an HTTP GET request to a specific endpoint inside the container.

```yaml
httpGet:
  path: /health
  port: 8080
initialDelaySeconds: 5
periodSeconds: 10
```

- Success: any HTTP status code in the 200-399 range.
- Failure: any other status code.
TCP Socket (`tcpSocket`)

Opens a TCP connection on a specific port.

```yaml
tcpSocket:
  port: 5432
initialDelaySeconds: 10
periodSeconds: 5
failureThreshold: 30
```

- Success: the port is open (the connection is established).
- Failure: the port is closed (the connection cannot be established).
Example
Let's add a liveness probe and a readiness probe to our `simple-go` app. For the readiness probe we will use the root `/` endpoint. Edit the `deployment.yaml` file and add this:
```yaml
readinessProbe:
  httpGet:
    path: /
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
```
For the liveness probe we will create a new endpoint, `/health`. This endpoint checks a variable, `status`: if it is `true`, the endpoint returns HTTP code `200`, and if it is `false`, it returns HTTP code `500`.

Edit the `main.go` file to add the new `/health` endpoint.
```go
// status controls what the /health endpoint reports.
var status = true

srv.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
	if status {
		w.WriteHeader(http.StatusOK)
	} else {
		w.WriteHeader(http.StatusInternalServerError)
	}
})
```
Then add the liveness probe configuration in the `deployment.yaml` file.
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
```
Rebuild the app with Docker, then re-apply the `deployment.yaml` file using `kubectl apply`.
```shell
➜ docker build --tag simple-go .
[+] Building 13.1s (13/13) FINISHED
...
➜ kubectl apply -f deployment.yaml
deployment.apps/simple-go configured
```
Your app should now be running with both a liveness probe and a readiness probe.
Simulate Liveness Failure
Let's edit our app's `/health` endpoint again. This time we add a counter: once the counter exceeds a threshold, the endpoint returns HTTP code `500`.
```go
var counter = 0

srv.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
	counter++
	// After 10 checks, start reporting the app as unhealthy.
	if counter > 10 {
		status = false
	}
	if status {
		w.WriteHeader(http.StatusOK)
	} else {
		w.WriteHeader(http.StatusInternalServerError)
	}
})
```
Rebuild the app with Docker again and restart your deployment. After a few minutes, check the pod list and you should see the pods restarting repeatedly.
```shell
➜ kubectl get pods
NAME                         READY   STATUS    RESTARTS      AGE
simple-go-75ffcc845f-4xzsc   1/1     Running   4 (11s ago)   8m52s
simple-go-75ffcc845f-5hzl5   1/1     Running   4 (22s ago)   9m2s
simple-go-75ffcc845f-khnsz   0/1     Running   4 (1s ago)    8m41s
```
And if you check the events, you will see that the pods are killed because of liveness probe failures.
```shell
➜ kubectl events | grep probe
2m56s (x3 over 7m16s)   Normal    Killing    Pod/simple-go-75ffcc845f-5hzl5   Container server failed liveness probe, will be restarted
2m45s (x3 over 7m5s)    Normal    Killing    Pod/simple-go-75ffcc845f-4xzsc   Container server failed liveness probe, will be restarted
2m35s (x3 over 6m55s)   Normal    Killing    Pod/simple-go-75ffcc845f-khnsz   Container server failed liveness probe, will be restarted
66s (x10 over 7m36s)    Warning   Unhealthy  Pod/simple-go-75ffcc845f-5hzl5   Liveness probe failed: HTTP probe failed with statuscode: 500
45s (x11 over 7m25s)    Warning   Unhealthy  Pod/simple-go-75ffcc845f-4xzsc   Liveness probe failed: HTTP probe failed with statuscode: 500
35s (x11 over 7m15s)    Warning   Unhealthy  Pod/simple-go-75ffcc845f-khnsz   Liveness probe failed: HTTP probe failed with statuscode: 500
```