Kubernetes Policy-as-Code with Kyverno

The Problem with "Just Tell Them Not To"

Every Kubernetes cluster I have managed eventually hits the same issue: someone deploys a container running as root, someone else ships a workload without resource limits, and a third person pushes an image from Docker Hub straight into production with no vulnerability scan.

You can write a wiki page with best practices. You can send Slack reminders. None of it works at scale. People forget, people are busy, and people onboard without reading the wiki.

The fix is to encode your rules directly into the cluster. Policy-as-Code means the cluster rejects bad configurations at admission time, before anything gets created. The developer gets immediate feedback, the security team does not need to manually review every deployment, and your compliance posture is consistent across every namespace.

Why Kyverno

There are a few tools in this space. OPA Gatekeeper is the original, and it works, but it requires learning Rego - a policy language most teams do not want to invest in. Kyverno takes a different approach: policies are native Kubernetes resources written in YAML. If your team can write a Kubernetes manifest, they can write a Kyverno policy.

Kyverno does three things:

Validate - reject resources that break rules (e.g. no latest tags, must have resource limits)
Mutate - automatically fix resources on admission (e.g. inject labels, add default resource requests)
Generate - create companion resources when something is deployed (e.g. auto-create NetworkPolicies)

Installing Kyverno

Helm is the fastest path:

Terminal

helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

helm install kyverno kyverno/kyverno \
  --namespace kyverno \
  --create-namespace \
  --version 3.3.4 \
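  --set admissionController.replicas=3 \
  --set backgroundController.enabled=true

Three admission controller replicas give the webhook high availability; in the 3.x chart the replica count is set per controller, not through a top-level replicaCount. The background controller processes generate and mutate-existing rules. Scanning existing resources against policies is handled by the reports controller, which the chart enables by default.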
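The same settings can live in a values file instead of --set flags, which is easier to review in Git. A minimal sketch, assuming the 3.x chart layout where each controller has its own block; the file name kyverno-values.yaml is hypothetical:

```yaml
# kyverno-values.yaml (hypothetical file name)
# Assumes the Kyverno Helm chart 3.x values layout.
admissionController:
  replicas: 3        # HA for the admission webhook
backgroundController:
  enabled: true      # processes generate and mutate-existing rules
reportsController:
  enabled: true      # produces PolicyReports, including background scans
```

Pass it to the install command with -f kyverno-values.yaml in place of the --set flags.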

Verify it is running:

Terminal

kubectl get pods -n kyverno

Validation Policies

Block Latest Tag

The most common starting point. This stops anyone from deploying an image with :latest or no tag at all:

YAML

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
  annotations:
    policies.kyverno.io/title: Disallow Latest Tag
    policies.kyverno.io/category: Best Practices
    policies.kyverno.io/severity: medium
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: validate-image-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Using ':latest' or no tag is not allowed. Pin a specific version."
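        pattern:
          spec:
            containers:
              - image: "*:* & !*:latest"
            # =() applies the check only when the field is present
            =(initContainers):
              - image: "*:* & !*:latest"
            =(ephemeralContainers):
              - image: "*:* & !*:latest"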

When validationFailureAction is set to Enforce, Kyverno rejects the resource. Set it to Audit first to see what would fail without breaking anything.
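To make the intended behavior concrete, here are two hypothetical Pods. Assuming the policy works as described (a tag is required and :latest is rejected), the first would be admitted and the second blocked. The names and the nginx image are illustrative only:

```yaml
# Admitted: the image is pinned to an explicit version
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod          # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.27.0
---
# Rejected: ':latest' is a mutable tag
apiVersion: v1
kind: Pod
metadata:
  name: latest-pod          # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:latest
```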

Require Resource Limits

No deployment should run without CPU and memory limits. Without them, a single pod can consume all node resources and starve everything else:

YAML

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "CPU and memory limits are required for all containers."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"
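The pattern above only inspects spec.containers. If init containers should be held to the same rule, one option is to add them under an equality anchor, =(...), which applies the check only when the field exists. This is a sketch of the validate block only, not a tested drop-in:

```yaml
validate:
  message: "CPU and memory limits are required for all containers."
  pattern:
    spec:
      containers:
        - resources:
            limits:
              memory: "?*"
              cpu: "?*"
      # =() means: if initContainers is present, every entry must match
      =(initContainers):
        - resources:
            limits:
              memory: "?*"
              cpu: "?*"
```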

Require Labels

Enforce standard labels for cost tracking, ownership, and observability:

YAML

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-labels
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: check-required-labels
      match:
        any:
          - resources:
              kinds:
                - Deployment
                - StatefulSet
                - DaemonSet
      validate:
        message: "Labels 'app.kubernetes.io/name', 'app.kubernetes.io/team', and 'app.kubernetes.io/env' are required."
        pattern:
          metadata:
            labels:
              app.kubernetes.io/name: "?*"
              app.kubernetes.io/team: "?*"
              app.kubernetes.io/env: "?*"

Mutation Policies

Mutation policies modify resources on the way in. This is powerful for enforcing defaults without making developers change their manifests.

Auto-inject Team Labels

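If a namespace has a team label, automatically copy it onto every pod created in that namespace. Kyverno has no built-in function for reading namespace labels inside a variable, so the rule looks the namespace up with an apiCall context entry and skips the mutation when the label is missing:

YAML

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-team-label
spec:
  background: false
  rules:
    - name: inject-team-from-namespace
      match:
        any:
          - resources:
              kinds:
                - Pod
      context:
        # Fetch the Pod's namespace and read its 'team' label ('' if unset)
        - name: team
          apiCall:
            urlPath: "/api/v1/namespaces/{{ request.namespace }}"
            jmesPath: "metadata.labels.team || ''"
      preconditions:
        all:
          # Do nothing when the namespace has no 'team' label
          - key: "{{ team }}"
            operator: NotEquals
            value: ""
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              app.kubernetes.io/team: "{{ team }}"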

Set Default Resource Requests

If a container has no resource requests, inject sensible defaults so the scheduler can make good placement decisions:

YAML

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: default-resource-requests
spec:
  background: false
  rules:
    - name: set-default-requests
      match:
        any:
          - resources:
              kinds:
                - Pod
      mutate:
        patchStrategicMerge:
          spec:
            containers:
              - (name): "*"
                resources:
                  requests:
                    +(memory): "128Mi"
                    +(cpu): "50m"

The +() syntax means "add only if not already set." It will not override anything the developer has explicitly defined.
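As an illustration, consider a hypothetical container that sets only a CPU request. The fragments below show the container as submitted and as it would look after the mutation, assuming no other policy touches it:

```yaml
# As submitted: only a CPU request is defined
containers:
  - name: api                  # hypothetical container
    image: example/api:1.4.2   # hypothetical image
    resources:
      requests:
        cpu: "200m"
---
# After mutation: the cpu value is kept, the memory default is added
containers:
  - name: api
    image: example/api:1.4.2
    resources:
      requests:
        cpu: "200m"
        memory: "128Mi"
```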

Generation Policies

Generation policies create new resources automatically when certain conditions are met.

Auto-create NetworkPolicy

When a new namespace is created, automatically create a default-deny ingress NetworkPolicy in it:

YAML

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-default-networkpolicy
spec:
  background: false
  rules:
    - name: default-deny-ingress
      match:
        any:
          - resources:
              kinds:
                - Namespace
      generate:
        synchronize: true
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny-ingress
        namespace: "{{ request.object.metadata.name }}"
        data:
          spec:
            podSelector: {}
            policyTypes:
              - Ingress

With synchronize: true, if someone deletes the NetworkPolicy, Kyverno recreates it. The policy stays in place.
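For a new namespace called team-a (a hypothetical name), the generated object would look roughly like this. Kyverno also attaches its own tracking labels to generated resources, which are omitted here:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a        # hypothetical namespace that triggered the rule
spec:
  podSelector: {}          # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress              # no ingress rules listed, so all inbound traffic is denied
```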

Rollout Strategy

Do not turn on Enforce mode across the cluster on day one. Here is the rollout I use:

Phase 1: Audit Mode (Week 1-2)

Set all policies to Audit mode. This logs violations without blocking anything:

YAML

spec:
  validationFailureAction: Audit

Check what would fail:

Terminal

kubectl get policyreport -A -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,PASS:.summary.pass,FAIL:.summary.fail

Phase 2: Enforce on Non-Prod (Week 3-4)

Add a namespaceSelector to the match block so the rule applies only to namespaces labeled env=dev or env=staging, then switch the policy to Enforce:

YAML

rules:
  - name: validate-image-tag
    match:
      any:
        - resources:
            kinds:
              - Pod
            namespaceSelector:
              matchExpressions:
                - key: env
                  operator: In
                  values:
                    - dev
                    - staging
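Narrowing match means production namespaces are no longer evaluated by the rule at all. An alternative that keeps auditing everywhere while enforcing in selected namespaces is validationFailureActionOverrides. A sketch, assuming namespaces literally named dev and staging; Kyverno 1.13 deprecates the spec-level fields in favor of per-rule equivalents, so check the documentation for your version:

```yaml
spec:
  validationFailureAction: Audit      # default for every namespace
  validationFailureActionOverrides:
    - action: Enforce                 # block violations in these namespaces
      namespaces:
        - dev                         # assumed namespace name
        - staging                     # assumed namespace name
```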

Phase 3: Enforce Everywhere (Week 5+)

Remove the namespace selector. Full enforcement. By this point teams have had weeks to fix their configurations.

Exceptions

Some workloads genuinely need to break rules (init containers running as root for filesystem setup, etc). Use exclude blocks:

YAML

rules:
  - name: validate-image-tag
    exclude:
      any:
        - resources:
            namespaces:
              - kube-system
              - kyverno
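For finer-grained exemptions, a PolicyException can exempt specific workloads from a specific rule without editing the policy itself. The sketch below assumes a Kyverno release that serves the kyverno.io/v2 API (older releases use a beta version) and that PolicyExceptions are enabled in the installation; the namespace and workload names are hypothetical:

```yaml
apiVersion: kyverno.io/v2
kind: PolicyException
metadata:
  name: allow-latest-for-legacy-app   # hypothetical name
  namespace: legacy                   # hypothetical namespace
spec:
  exceptions:
    - policyName: disallow-latest-tag
      ruleNames:
        - validate-image-tag
  match:
    any:
      - resources:
          kinds:
            - Pod
          namespaces:
            - legacy
          names:
            - legacy-app-*            # wildcard match on the Pod name
```

Because the exception is its own resource, it can go through the same review process as any other manifest.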

CI Pipeline Integration

Shift left by validating manifests in CI before they reach the cluster. The Kyverno CLI does this:

Terminal

# Install the CLI
brew install kyverno

# Test manifests against policies locally
kyverno apply ./policies/ --resource ./manifests/deployment.yaml

Add this to your GitHub Actions pipeline:

YAML

# .github/workflows/policy-check.yml
name: Policy Check
on: [pull_request]

jobs:
  kyverno-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Kyverno CLI
        run: |
          curl -LO https://github.com/kyverno/kyverno/releases/download/v1.12.0/kyverno-cli_v1.12.0_linux_x86_64.tar.gz
          tar -xzf kyverno-cli_v1.12.0_linux_x86_64.tar.gz
          sudo mv kyverno /usr/local/bin/

      - name: Validate manifests
        run: |
          # kyverno apply exits non-zero when any resource fails a policy,
          # which fails this step. Do not pipe the output (for example to tee):
          # without pipefail the pipe would mask the exit code.
          # Note: -o/--output writes mutated resources to a file; it is not a format flag.
          kyverno apply ./policies/ --resource ./k8s/

Now policy violations fail the PR before anyone needs to review them.
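The CLI can also run declarative tests, which helps catch regressions in the policies themselves. Below is a sketch of a kyverno-test.yaml, assuming a recent CLI that uses the cli.kyverno.io/v1alpha1 schema, with hypothetical file paths and Pod names. The expected results reflect the policy's stated intent:

```yaml
# tests/disallow-latest-tag/kyverno-test.yaml (hypothetical path)
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: disallow-latest-tag
policies:
  - ../../policies/disallow-latest-tag.yaml   # hypothetical path
resources:
  - resources.yaml    # hypothetical file with Pods named pinned-pod and latest-pod
results:
  - policy: disallow-latest-tag
    rule: validate-image-tag
    kind: Pod
    resources:
      - pinned-pod    # image pinned to a version
    result: pass
  - policy: disallow-latest-tag
    rule: validate-image-tag
    kind: Pod
    resources:
      - latest-pod    # image uses :latest
    result: fail
```

Running kyverno test ./tests/ in the pipeline would then fail if a policy change alters either outcome.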

Monitoring

Kyverno generates PolicyReport resources that work with standard Kubernetes tooling:

Terminal

# View all violations across the cluster
kubectl get polr -A -o wide

# Count violations by policy
kubectl get polr -A -o json | \
  jq -r '.items[].results[]? | select(.result=="fail") | .policy' | \
  sort | uniq -c | sort -rn

For dashboards, Kyverno exposes Prometheus metrics. These are a good starting set for Grafana panels:

kyverno_admission_review_duration_seconds - latency of admission reviews
kyverno_policy_results_total - count of pass/fail/warn/error by policy
kyverno_admission_requests_total - total admission requests
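The same metrics can drive alerts. The rule below is a sketch of a Prometheus alerting rule built on kyverno_policy_results_total; the label names rule_result and policy_name, the file name, and the thresholds are assumptions to verify against the metrics your Kyverno version actually exposes:

```yaml
# kyverno-alerts.yaml (hypothetical Prometheus rule file)
groups:
  - name: kyverno
    rules:
      - alert: KyvernoPolicyFailures
        # Fires when a policy recorded failures during the last 15 minutes.
        # Label names rule_result and policy_name are assumptions.
        expr: |
          sum by (policy_name) (
            increase(kyverno_policy_results_total{rule_result="fail"}[15m])
          ) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Kyverno policy {{ $labels.policy_name }} is reporting failures"
```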

Common Pitfalls

1. Enforcing before auditing. You will break existing workloads. Always start in Audit mode and review PolicyReports before switching to Enforce.

2. Not excluding system namespaces. Kyverno should not validate kube-system, kyverno, or other infrastructure namespaces. Their workloads have different requirements.

3. Forgetting init containers. A policy that validates containers but ignores initContainers and ephemeralContainers has a gap. Cover all three.

4. No exception process. Some workloads legitimately need to break rules. Build an exception mechanism using exclude blocks or policy exceptions from day one. Without it, teams will push back on the entire system.

5. Too many policies at once. Start with 3-5 high-impact policies (latest tag, resource limits, required labels). Add more as the team builds confidence.

Summary

Kyverno gives you governance without Rego, enforcement without manual reviews, and compliance evidence via PolicyReports. Start with Audit mode, graduate to Enforce, and integrate into CI so violations never reach the cluster. Plan on about five weeks from install to full enforcement if you follow the phased rollout.
