
Kubernetes 1.35: In-Place Pod Vertical Scaling Reaches GA

Kubernetes v1.35 brings in-place pod vertical scaling to GA after 6+ years. Adjust CPU/memory without pod recreation - but beware anti-patterns. This comprehensive guide covers proper use cases, alternatives like HPA/VPA, and why this powerful feature should be reserved for specific edge cases only.

Kubernetes v1.35, released on 17 December 2025, brings the long-awaited in-place pod vertical scaling feature to general availability (GA). This feature, officially called 'In-Place Pod Resize', allows administrators to adjust CPU and memory resources for running containers without recreating pods.

The Traditional Approach to Resource Scaling

Before this feature, scaling pod resources required deleting and recreating the pod:

1. Update deployment/pod specification with new resource values
2. Kubernetes terminates the existing pod
3. Scheduler creates a new pod with updated resources
4. Container runtime starts the new pod
5. Application initialises and becomes ready

This approach causes:

  • Dropped network connections
  • Loss of in-memory application state
  • Service disruption during pod restart
  • Extended downtime for applications with long startup times
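
In practice, the pre-v1.35 flow looks something like this (a minimal sketch, assuming a hypothetical Deployment named web with a container called app):

# Update the Deployment's resources declaratively
kubectl set resources deployment/web -c app \
  --requests=cpu=500m,memory=512Mi --limits=cpu=1,memory=1Gi

# Kubernetes applies the change by replacing each pod, not by resizing it
kubectl rollout status deployment/web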

What is In-Place Pod Vertical Scaling?

In-Place Pod Vertical Scaling enables runtime modification of CPU and memory resource requests and limits without pod recreation. The Kubelet adjusts cgroup limits for running containers, allowing resource changes with minimal disruption.

Development Timeline

  • v1.27 (April 2023): Alpha release
  • v1.33 (May 2025): Beta release
  • v1.35 (December 2025): General availability (stable)

The feature required 6+ years of development to solve complex challenges, including container runtime coordination, scheduler synchronisation, and memory safety guarantees.

Key Features in v1.35

1. Memory Limit Decrease

Previous versions prohibited memory limit decreases due to out-of-memory (OOM) concerns. Version 1.35 implements best-effort memory decrease with safety checks:

  • Kubelet verifies current memory usage is below the new limit
  • Resize fails gracefully if usage exceeds the new limit
  • Not guaranteed to prevent OOM, but significantly safer than forced decrease

Example:

# Previously prohibited, now allowed
resources:
  requests:
    memory: "512Mi"  # Decreased from 1Gi
  limits:
    memory: "1Gi"    # Decreased from 2Gi

2. Prioritised Resize Queue

When node capacity is insufficient for all resize requests, Kubernetes prioritises them by:

  1. PriorityClass value (higher first)
  2. QoS class (Guaranteed > Burstable > BestEffort)
  3. Duration deferred (oldest first)

This ensures critical workloads receive resources before lower-priority pods.
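
For illustration, a PriorityClass such as the following (the name resize-critical is hypothetical) can be attached to workloads whose resize requests should win when a node is constrained:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: resize-critical          # illustrative name
value: 100000                    # higher values are considered first
globalDefault: false
description: "Workloads whose in-place resizes should be honoured first"

Pods opt in by setting spec.priorityClassName: resize-critical.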

3. Enhanced Observability

New metrics and events improve resize operation tracking:

Kubelet Metrics:

  • pod_resource_resize_requests_total
  • pod_resource_resize_failures_total
  • pod_resource_resize_duration_seconds

Pod Conditions:

  • PodResizePending: Request cannot be immediately granted
  • PodResizeInProgress: Kubelet is applying changes

4. VPA Integration

Vertical Pod Autoscaler (VPA) InPlaceOrRecreate update mode graduated to beta, enabling automatic resource adjustment using in-place resize when possible.
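
A minimal VPA object using this mode might look like the following (a sketch assuming the VPA components are installed at a release that supports InPlaceOrRecreate, and a Deployment named app):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  updatePolicy:
    updateMode: "InPlaceOrRecreate"   # resize in place when possible, recreate otherwise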

How It Works

Resize Policies

Containers specify restart behaviour for each resource type:

apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
  - name: app
    image: nginx:1.27
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "1000m"
        memory: "1Gi"
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired      # No restart for CPU changes
    - resourceName: memory
      restartPolicy: RestartContainer  # Restart required for memory

Policy Options:

  • NotRequired: Apply changes without restarting the container (the default when no resize policy is specified)
  • RestartContainer: Restart container to apply changes (often needed for memory)

Applying Resource Changes

Use the --subresource=resize flag with kubectl (requires kubectl v1.32+):

kubectl patch pod example --subresource=resize --patch '
{
  "spec": {
    "containers": [
      {
        "name": "app",
        "resources": {
          "requests": {"cpu": "800m"},
          "limits": {"cpu": "1600m"}
        }
      }
    ]
  }
}'

Monitoring Resize Status

Check desired vs actual resources:

# Desired (spec)
kubectl get pod example -o jsonpath='{.spec.containers[0].resources}'

# Actual (status)
kubectl get pod example -o jsonpath='{.status.containerStatuses[0].resources}'

View resize conditions:

kubectl get pod example -o jsonpath='{.status.conditions[?(@.type=="PodResizeInProgress")]}'
kubectl describe pod example | grep -A 10 Events

Use Cases

1. Peak Hour Resource Scaling

Scale resources to match daily traffic patterns without service disruption. For example, scale up CPU/memory at 08:55 for business hours, then scale down at 18:05 after hours, eliminating the need for pod recreation.
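
One way to automate this is a CronJob that patches the resize subresource on a schedule. The sketch below is illustrative: it assumes a pod named app-0, a ServiceAccount resize-bot with RBAC permission to patch pods/resize, and a kubectl image such as bitnami/kubectl; a mirror job scheduled at "5 18 * * 1-5" would scale back down.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-business-hours
spec:
  schedule: "55 8 * * 1-5"             # 08:55 on weekdays
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: resize-bot   # needs patch rights on pods/resize
          restartPolicy: Never
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command:
            - kubectl
            - patch
            - pod
            - app-0
            - --subresource=resize
            - --patch
            - '{"spec":{"containers":[{"name":"app","resources":{"requests":{"cpu":"2"},"limits":{"cpu":"2"}}}]}}'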

2. Database Maintenance Operations

Temporarily increase CPU allocation for intensive maintenance tasks like VACUUM or REINDEX operations, then restore normal allocation once complete. This avoids pod restart and preserves warm database caches.
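
A rough sketch of that flow, assuming a PostgreSQL pod named postgres-0 with a container called postgres:

# Give the pod extra CPU before the maintenance window
kubectl patch pod postgres-0 --subresource=resize --patch \
  '{"spec":{"containers":[{"name":"postgres","resources":{"requests":{"cpu":"4"},"limits":{"cpu":"4"}}}]}}'

# Run the maintenance task
kubectl exec postgres-0 -- psql -U postgres -c 'VACUUM ANALYZE;'

# Restore the normal allocation once it completes
kubectl patch pod postgres-0 --subresource=resize --patch \
  '{"spec":{"containers":[{"name":"postgres","resources":{"requests":{"cpu":"1"},"limits":{"cpu":"1"}}}]}}'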

3. JIT Compilation Warmup

Provide additional CPU during application startup for JIT compilation warmup, then reduce allocation once the application reaches steady state. Particularly beneficial for Java applications with large codebases.

4. Cost Optimisation

Dynamically right-size resources to reduce waste by scaling down during low-traffic periods, reducing over-provisioning based on actual usage patterns, and implementing time-based scaling without pod recreation overhead.

Limitations and Constraints

Resource Support:

  • Only CPU and memory can be resized (GPU, ephemeral storage, hugepages remain immutable)
  • Init and ephemeral containers cannot be resized
  • QoS class cannot change during resize

Platform Requirements:

  • Not supported on Windows nodes
  • Requires compatible container runtime (containerd v2.0+, CRI-O v1.25+)
  • Cannot resize with static CPU or Memory manager policies

Important Behaviours:

  • Most language runtimes (Java, Python, Node.js) size memory at startup, so memory changes typically require a container restart to take effect
  • Updating Deployment specs does not auto-resize existing pods
  • Cannot remove requests or limits entirely, only modify values

Upgrading to Kubernetes v1.35

Prerequisites

  • Kubernetes v1.34.x cluster
  • kubectl v1.32+ client
  • Compatible container runtime

Upgrade Process

1. Add v1.35 repository to all nodes:

# On each node
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.35/deb/Release.key | \
  sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-1.35-apt-keyring.gpg

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-1.35-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.35/deb/ /' | \
  sudo tee /etc/apt/sources.list.d/kubernetes-1.35.list

sudo apt update

2. Upgrade control plane:

sudo apt-mark unhold kubeadm
sudo apt install -y kubeadm=1.35.0-1.1
sudo kubeadm upgrade apply v1.35.0 -y
sudo apt-mark unhold kubelet kubectl
sudo apt install -y kubelet=1.35.0-1.1 kubectl=1.35.0-1.1
sudo apt-mark hold kubeadm kubelet kubectl
sudo systemctl restart kubelet

3. Upgrade workers (one at a time):

kubectl drain <worker-name> --ignore-daemonsets --delete-emptydir-data --force

ssh <worker-node>
sudo apt-mark unhold kubeadm kubelet kubectl
sudo apt install -y kubeadm=1.35.0-1.1 kubelet=1.35.0-1.1 kubectl=1.35.0-1.1
sudo kubeadm upgrade node
sudo apt-mark hold kubeadm kubelet kubectl
sudo systemctl restart kubelet

kubectl uncordon <worker-name>

4. Verify upgrade:

kubectl get nodes
# All nodes should show v1.35.0

Testing In-Place Resize

Basic CPU Resize Test

apiVersion: v1
kind: Pod
metadata:
  name: cpu-test
spec:
  containers:
  - name: nginx
    image: nginx:1.27
    resources:
      requests:
        cpu: "250m"
      limits:
        cpu: "500m"
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired

Create pod and verify baseline:

kubectl apply -f cpu-test.yaml
kubectl get pod cpu-test -o jsonpath='{.status.containerStatuses[0].restartCount}'
# Output: 0

Resize CPU:

kubectl patch pod cpu-test --subresource=resize --patch '
{
  "spec": {
    "containers": [{
      "name": "nginx",
      "resources": {
        "requests": {"cpu": "800m"},
        "limits": {"cpu": "800m"}
      }
    }]
  }
}'

Verify no restart occurred:

kubectl get pod cpu-test -o jsonpath='{.status.containerStatuses[0].restartCount}'
# Output: 0 (unchanged)

kubectl get pod cpu-test -o jsonpath='{.spec.containers[0].resources.requests.cpu}'
# Output: 800m (updated)

Memory Decrease Test
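
A minimal memory-test.yaml for this test could look like the following (an illustrative manifest; the pod and container names match the patches below):

apiVersion: v1
kind: Pod
metadata:
  name: memory-test
spec:
  containers:
  - name: app
    image: nginx:1.27
    resources:
      requests:
        memory: "512Mi"
      limits:
        memory: "1Gi"
    resizePolicy:
    - resourceName: memory
      restartPolicy: RestartContainer   # memory changes may require a restart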

# Create pod with high memory
kubectl apply -f memory-test.yaml

# Increase memory first
kubectl patch pod memory-test --subresource=resize --patch '
{
  "spec": {
    "containers": [{
      "name": "app",
      "resources": {
        "requests": {"memory": "1Gi"},
        "limits": {"memory": "2Gi"}
      }
    }]
  }
}'

# Decrease memory (new in v1.35)
kubectl patch pod memory-test --subresource=resize --patch '
{
  "spec": {
    "containers": [{
      "name": "app",
      "resources": {
        "requests": {"memory": "512Mi"},
        "limits": {"memory": "1Gi"}
      }
    }]
  }
}'
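
To confirm the decrease was applied (or see why it was deferred), the same status fields shown earlier apply here:

# Actual memory limit now enforced on the container
kubectl get pod memory-test -o jsonpath='{.status.containerStatuses[0].resources.limits.memory}'

# If the kubelet could not apply the change, the reason appears on the pending condition
kubectl get pod memory-test -o jsonpath='{.status.conditions[?(@.type=="PodResizePending")]}'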

Best Practices

Testing: Verify application behaviour in non-production before enabling in production.

Resize Policies: Use NotRequired for CPU (usually safe), RestartContainer for memory (often requires restart).

Monitoring: Track Kubelet metrics for resize operations, alert on persistent PodResizePending conditions, monitor OOM events after decreases.
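
As a sketch, a Prometheus alerting rule along these lines could flag failing resizes; it uses the kubelet metric names listed earlier in this article, so verify the exact names your kubelet version exposes before relying on it:

groups:
- name: pod-resize
  rules:
  - alert: PodResizeFailures
    expr: increase(pod_resource_resize_failures_total[15m]) > 0
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "In-place pod resizes are failing on {{ $labels.instance }}"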

Resource Changes: Make incremental adjustments, avoid large jumps, verify current usage before decreasing limits.

Priority: Use PriorityClasses for critical workloads to ensure resize priority during capacity constraints.

Troubleshooting

Resize Rejected (stuck in PodResizePending): Check node capacity (kubectl describe node), verify QoS class compatibility, review kubelet configuration for static resource managers, ensure resize policy allows the change.
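
A quick way to see what is blocking a pending resize (reusing the example pod from earlier):

# The condition's reason and message explain why the resize is deferred or infeasible
kubectl get pod example -o jsonpath='{.status.conditions[?(@.type=="PodResizePending")]}'

# Check whether the node has headroom for the new requests
kubectl describe node <node-name> | grep -A 8 'Allocated resources'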

Unexpected Restarts: Memory changes often require restart regardless of policy. Use RestartContainer for memory, test application behaviour, monitor OOM events.

OOM After Memory Decrease: Verify current usage first (kubectl top pod), make gradual decreases, ensure application can release memory under pressure, consider using RestartContainer to force release.

Anti-Patterns: When NOT to Use In-Place Resize

Important: In-place pod resize is a powerful feature that can become an anti-pattern if misused. It should be reserved for specific edge cases, not used as a general scaling strategy.

Why It Can Be an Anti-Pattern

1. Violates Immutable Infrastructure: Kubernetes follows "cattle not pets" philosophy with ephemeral, replaceable pods. In-place resize makes pods mutable and long-lived, contradicting this design principle and reducing system resilience.

2. Breaks GitOps: Manual pod patching creates drift between Git repository and cluster reality. Next deployment overwrites manual changes, cluster state doesn't match version control, and you cannot reproduce environments from Git.

3. Configuration Drift: Pods in the same Deployment can have different resource allocations, causing inconsistent behaviour across replicas, difficult debugging, and unpredictable load distribution.

4. Reduces Failure Isolation: Encourages keeping pods alive longer rather than quick replacement, which can hide underlying issues like memory leaks and delay necessary restarts.

5. Better Alternatives Exist: Traffic spikes should use HPA (scale out), resource changes should update Deployments (declarative), and environment differences should use Kustomize/Helm.

Some Valid Use Cases

1. Stateful Applications with Expensive Startup: Databases with 15+ minute startup times (cache warming, index loading). Example: Scale up CPU for PostgreSQL VACUUM/ANALYZE operations, then scale down. Justified because recreation cost exceeds configuration drift cost.

2. Single-Instance Workloads: Legacy applications with shared file locks, non-distributed state, or connection pooling limitations. Justified because horizontal scaling isn't possible without full rewrite.

3. Emergency Production Incidents: Immediate relief for OOM conditions at 3am when proper solutions require hours of testing. Must be temporary (fix properly within 24 hours) and documented.

4. Predictable Temporary Spikes: Monthly report generation requiring 4x CPU for 2 hours. Automate with CronJobs to scale up and down on schedule (as in the sketch under use case 1). Justified because provisioning 4x resources around the clock would leave the extra capacity idle more than 99% of the time (2 hours out of roughly 730 per month).

5. Testing and Capacity Planning: Non-production experimentation to measure performance at different resource levels and determine optimal production sizing.

Future Enhancements

Kubernetes SIG-Node is working on:

  • Removing static CPU/Memory manager restrictions
  • Support for additional resource types
  • Improved safety for memory decreases (runtime-level checks)
  • Resource pre-emption (evict lower-priority pods for high-priority resizes)
  • Better integration with horizontal autoscaling

Conclusion

In-Place Pod Vertical Scaling addresses a long-standing Kubernetes limitation, enabling resource adjustments without service disruption. The v1.35 GA release brings production-ready capability with important improvements including memory decrease support, prioritised queuing, and enhanced observability.

While this feature enables dynamic resource management for stateful workloads and emergency scenarios, it should complement rather than replace traditional scaling patterns. Use HPA for traffic-based scaling, update Deployments for permanent changes, and reserve in-place resize for the specific edge cases where pod recreation cost genuinely exceeds operational complexity.
