Run Plain Pods

Run a single Pod, or a group of Pods as a Kueue-managed job.

This page shows how to leverage Kueue’s scheduling and resource management capabilities when running plain Pods. Kueue supports management of both individual Pods and Pod groups.

This guide is for batch users who have a basic understanding of Kueue. For more information, see Kueue’s overview.

Before you begin

  1. By default, the integration for v1/pod is not enabled. Learn how to install Kueue with a custom manager configuration and enable the pod integration.

    A Kueue configuration with the pod integration enabled looks like the following:

    apiVersion: config.kueue.x-k8s.io/v1beta1
    kind: Configuration
    integrations:
      frameworks:
      - "pod"
      podOptions:
        # You can change namespaceSelector to define in which 
        # namespaces kueue will manage the pods.
        namespaceSelector:
          matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: NotIn
            values: [ kube-system, kueue-system ]
        # Kueue uses podSelector to manage pods with particular 
        # labels. The default podSelector will match all the pods. 
        podSelector:
          matchExpressions:
          - key: kueue-job
            operator: In
            values: [ "true", "True", "yes" ]
    
  2. If the pod integration is enabled, Kueue runs webhooks for all created Pods. The webhook namespaceSelector can be used to filter which Pods are reconciled. The default webhook namespaceSelector is:

    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: [ kube-system, kueue-system ]
    

    When you install Kueue via Helm, the webhook namespace selector will match the integrations.podOptions.namespaceSelector in the values.yaml.

    Make sure that namespaceSelector never matches the namespace in which Kueue itself runs (kueue-system in the selectors above), otherwise the Kueue deployment won’t be able to create Pods.

  3. Pods that belong to other API resources managed by Kueue are excluded from being queued by pod integration. For example, pods managed by batch/v1.Job won’t be managed by pod integration.

  4. Check Administer cluster quotas for details on the initial Kueue setup.
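
The namespaceSelector from step 1 is a Kubernetes label selector, so it can also be written to opt namespaces in rather than out. The following is a minimal sketch under that assumption; the namespace names team-a and team-b are hypothetical and do not come from the configuration above.

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
integrations:
  frameworks:
  - "pod"
  podOptions:
    # Only manage Pods in the listed namespaces.
    # team-a and team-b are hypothetical namespace names.
    namespaceSelector:
      matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: In
        values: [ team-a, team-b ]
```

An allow-list like this never matches kube-system or kueue-system, which keeps it in line with the warning in step 2.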

Running a single Pod admitted by Kueue

When running Pods on Kueue, take into consideration the following aspects:

a. Queue selection

The target local queue should be specified in the metadata.labels section of the Pod configuration.

metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
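
The label value has to name a LocalQueue that exists in the Pod's namespace. As a minimal sketch, assuming the default namespace and a hypothetical ClusterQueue called cluster-queue (neither is specified on this page), a matching LocalQueue could look like this:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default   # assumption: the namespace the Pod is created in
  name: user-queue     # must equal the kueue.x-k8s.io/queue-name label value
spec:
  clusterQueue: cluster-queue   # hypothetical ClusterQueue name
```

See Administer cluster quotas for the actual queue setup in your cluster.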

b. Configure the resource needs

The resource needs of the workload are configured per container, in the resources.requests field of each entry under spec.containers.

    - resources:
        requests:
          cpu: 3

c. The “managed” label

Kueue will inject the kueue.x-k8s.io/managed=true label to indicate which pods are managed by it.
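
As an illustration, the labels of a Pod that Kueue manages would look roughly like the fragment below. This is a sketch based only on the label named above; other fields that Kueue may add to the Pod are not shown.

```yaml
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue   # set by the user
    kueue.x-k8s.io/managed: "true"          # injected by Kueue
```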

d. Limitations

  • A Kueue-managed Pod cannot be created in the kube-system or kueue-system namespaces.
  • In case of preemption, the Pod will be terminated and deleted.

Example Pod

Here is a sample Pod that just sleeps for a few seconds:

apiVersion: v1
kind: Pod
metadata:
  generateName: kueue-sleep-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  containers:
    - name: sleep
      image: busybox
      command:
        - sleep
      args:
        - 3s
      resources:
        requests:
          cpu: 3
  restartPolicy: OnFailure

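You can create the Pod using the following command. Because the manifest uses generateName, use kubectl create; kubectl apply does not support generateName.

# Create the pod
kubectl create -f kueue-pod.yaml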

Running a group of Pods to be admitted together

To run a set of Pods as a single unit, called a Pod group, add the “pod-group-name” label and the “pod-group-total-count” annotation to all members of the group, with the same values on every Pod:

metadata:
  labels:
    kueue.x-k8s.io/pod-group-name: "group-name"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"

Feature limitations

Kueue provides only the minimal functionality required to run Pod groups. It is intended for environments where the Pods are managed directly by external controllers, without a Job-level CRD.

As a consequence of this design decision, Kueue does not re-implement core functionalities that are available in the Kubernetes Job API, such as advanced retry policies. In particular, Kueue does not re-create failed Pods.

This design choice affects preemption. When Kueue needs to preempt a workload that represents a Pod group, Kueue sends delete requests for all of the Pods in the group. It is the responsibility of the user or controller that created the original Pods to create replacement Pods.
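
As a sketch of what such a replacement might look like: a replacement Pod is assumed to carry the same queue label, group label, and total-count annotation as the Pod it replaces, following the rule that all members of a group are labeled consistently. The generateName prefix and the container command are hypothetical.

```yaml
# Hypothetical replacement for a group member deleted on preemption.
apiVersion: v1
kind: Pod
metadata:
  generateName: replacement-   # hypothetical name prefix
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "group-name"   # same group as the original Pod
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"     # unchanged group size
spec:
  restartPolicy: Never
  containers:
  - name: sleep
    image: busybox
    command: ["sh", "-c", "sleep 3"]
    resources:
      requests:
        cpu: 3
```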

NOTE: We recommend using the Kubernetes Job API or similar CRDs such as JobSet, MPIJob, or RayJob (see more here).

Termination

Kueue considers a Pod group as successful, and marks the associated Workload as finished, when the number of succeeded Pods equals the Pod group size.

If a Pod group is not successful, you can terminate its execution and free the reserved resources in one of two ways:

  1. Issue a Delete request for the Workload object. Kueue will terminate all remaining Pods.
  2. Set the kueue.x-k8s.io/retriable-in-group: false annotation on at least one Pod in the group (can be a replacement Pod). Kueue will mark the workload as finished once all Pods are terminated.
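
For the second option, the metadata of the annotated Pod could look like the fragment below. This is a minimal sketch that reuses the group name and size from the earlier snippet; note that annotation values are strings, so false is quoted.

```yaml
metadata:
  labels:
    kueue.x-k8s.io/pod-group-name: "group-name"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
    # Marks the group as not retriable.
    kueue.x-k8s.io/retriable-in-group: "false"
```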

Example Pod group

Here is a sample Pod group that just sleeps for a few seconds:

---
apiVersion: v1
kind: Pod
metadata:
  generateName: sample-leader-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "sample-group"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
spec:
  restartPolicy: Never
  containers:
  - name: sleep
    image: busybox
    command: ["sh", "-c", 'echo "hello world from the leader pod" && sleep 3']
    resources:
      requests:
        cpu: 3
---
apiVersion: v1
kind: Pod
metadata:
  generateName: sample-worker-
  labels:
    kueue.x-k8s.io/queue-name: user-queue
    kueue.x-k8s.io/pod-group-name: "sample-group"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "2"
spec:
  restartPolicy: Never
  containers:
  - name: sleep
    image: busybox
    command: ["sh", "-c", 'echo "hello world from the worker pod" && sleep 2']
    resources:
      requests:
        cpu: 3

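You can create the Pod group using the following command. Because the manifests use generateName, use kubectl create; kubectl apply does not support generateName.

kubectl create -f kueue-pod-group.yaml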

The name of the associated Workload created by Kueue equals the name of the Pod group. In this example it is sample-group. You can inspect the Workload using:

kubectl describe workload/sample-group