Hardware Awareness Scheduling Based on NFD

A proposal for hardware awareness scheduling


Le, Huifeng (intel)

Xu, Di


NFD (Node Feature Discovery) detects the hardware features available on each node in a Kubernetes cluster and advertises them as node labels. Its NodeFeatureRule objects provide an easy way to create vendor- or application-specific labels covering CPU families, kernel, SR-IOV-enabled networks, NUMA architecture, etc. This PR aims to enable hardware awareness (e.g. cluster features) for clusternet scheduling.


Provides hardware awareness labels for child clusters, which can be very useful for users to select desired clusters for applications.


  • Discovers cluster features by enabling NFD in child clusters


  • Installs NFD in each child cluster


User Stories (Optional)

  • I want to know the hardware characteristics of child clusters, such as CPU, FPGA, etc.
  • I want to deploy my applications to child clusters with required hardware features.

Design Details

sequenceDiagram
    actor admin
    participant HK as kube-apiserver<br>(in hub cluster)
    participant A as clusternet-agent<br>(in child cluster)
    participant EK as kube-apiserver<br>(in child cluster)
    participant NM as nfd-master<br>(in child cluster)

    %% Subscription
    admin->>HK: Day 0:<br>Creates a Subscription<br>for NodeFeatureRule

    %% Cluster Status update
    loop Roll Update
        NM->>EK: Updates Node Label<br>such as<br>"node.clusternet.io/my-HA-feature": "true"
        A->>EK: Collects Cluster Status
        Note over A: Aggregates Common<br>Labels with prefix<br>"node.clusternet.io/"
        A->>HK: Patches ManagedCluster's<br>Labels with ClusterFeatures
    end

    admin->>HK: Day 1:<br>Creates a Subscription<br>with Cluster Feature Labels<br>in clusterAffinity

Before getting started, the custom resource NodeFeatureRule needs to be created in the desired child clusters. We can create a Subscription that adds the NodeFeatureRule as a feed, and clusternet will create those rules in the child clusters.

nfd-master, running in each child cluster, gets notified of the NodeFeatureRule and starts labelling all the nodes in that child cluster. clusternet-agent periodically collects information from its child cluster and aggregates the common node labels that have the prefix node.clusternet.io/. clusternet-agent then patches those common labels to the ManagedCluster.
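The aggregation step can be illustrated with a minimal Python sketch. It is not the clusternet-agent implementation: it assumes that "common" means a label present with the same value on every node in the child cluster, and the function name aggregate_common_labels and the sample node data are hypothetical.

```python
from typing import Dict, List

# Prefix named in this proposal.
PREFIX = "node.clusternet.io/"


def aggregate_common_labels(
    nodes: List[Dict[str, str]], prefix: str = PREFIX
) -> Dict[str, str]:
    """Return the prefixed labels that every node carries with the same value.

    Assumption: "common" is read as an intersection over all nodes.
    """
    if not nodes:
        return {}
    # Start from the first node's prefixed labels ...
    common = {k: v for k, v in nodes[0].items() if k.startswith(prefix)}
    # ... and keep only the entries each remaining node agrees on.
    for labels in nodes[1:]:
        common = {k: v for k, v in common.items() if labels.get(k) == v}
    return common


# Hypothetical label sets for two nodes in one child cluster.
node_labels = [
    {"node.clusternet.io/my-HA-feature": "true", "kubernetes.io/os": "linux"},
    {"node.clusternet.io/my-HA-feature": "true", "node.clusternet.io/other": "true"},
]

cluster_features = aggregate_common_labels(node_labels)
print(cluster_features)
```

Under this reading, only node.clusternet.io/my-HA-feature survives, because node.clusternet.io/other is missing from the first node and kubernetes.io/os lacks the prefix. A union-based aggregation would be an equally plausible design; the intersection is chosen here so that a cluster label implies every node can satisfy it.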

Below is a sample NodeFeatureRule.

apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: my-sample-rule-object
spec:
  rules:
    - name: "my sample rule"
      labels:
        "node.clusternet.io/my-HA-feature": "true"
      matchFeatures:
        - feature: kernel.loadedmodule
          matchExpressions:
            dummy: { op: Exists }
        - feature: kernel.config
          matchExpressions:
            X86: { op: In, value: [ "y" ] }
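To make the sample rule's semantics concrete, here is a small Python sketch of how such a rule could be evaluated against one node. It is an illustration only, not NFD's code: it covers just the two operators used in the sample (Exists and In), assumes that all listed features and expressions must match, and models discovered features as plain dictionaries. The helper names and the node_features data are hypothetical.

```python
from typing import Any, Dict

Features = Dict[str, Dict[str, str]]


def match_expression(attrs: Dict[str, str], name: str, expr: Dict[str, Any]) -> bool:
    """Evaluate one expression against the attributes of a single feature."""
    op = expr["op"]
    if op == "Exists":
        return name in attrs
    if op == "In":
        return name in attrs and attrs[name] in expr.get("value", [])
    raise ValueError(f"operator not covered by this sketch: {op}")


def rule_matches(rule: Dict[str, Any], features: Features) -> bool:
    """Assumption: every expression of every listed feature must match."""
    return all(
        match_expression(features.get(matcher["feature"], {}), name, expr)
        for matcher in rule["matchFeatures"]
        for name, expr in matcher["matchExpressions"].items()
    )


def labels_for(rule: Dict[str, Any], features: Features) -> Dict[str, str]:
    """Return the rule's labels if it matches, otherwise no labels."""
    return dict(rule["labels"]) if rule_matches(rule, features) else {}


# The sample rule from this section, written as a Python dictionary.
sample_rule = {
    "name": "my sample rule",
    "labels": {"node.clusternet.io/my-HA-feature": "true"},
    "matchFeatures": [
        {"feature": "kernel.loadedmodule",
         "matchExpressions": {"dummy": {"op": "Exists"}}},
        {"feature": "kernel.config",
         "matchExpressions": {"X86": {"op": "In", "value": ["y"]}}},
    ],
}

# Hypothetical features discovered on one node.
node_features = {
    "kernel.loadedmodule": {"dummy": ""},
    "kernel.config": {"X86": "y"},
}

print(labels_for(sample_rule, node_features))
```

A node without the dummy kernel module loaded would get no label from this rule, so the label would also drop out of the cluster-level aggregation.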

Other CR samples