Intermediate · 4 min read · compute.vcpus

What is a Millicore?

A millicore is 1/1000th of a CPU core. It's how Kubernetes measures CPU allocation — 1000m = 1 full vCPU. Essential for container-based GPU workloads.

What it is

A millicore (abbreviated m) is Kubernetes' unit for measuring CPU allocation. It's 1/1000th of a CPU core.

Think of it like currency. If a vCPU is a dollar, a millicore is a tenth of a cent. This granularity lets Kubernetes pack workloads efficiently — a small web server might need only 100m (10% of a core), while a data preprocessing job might need 4000m (4 full cores).

# Kubernetes pod spec (excerpt): the resources block of one container
resources:
  requests:
    cpu: "4000m"      # 4 vCPUs
    memory: "16Gi"
    nvidia.com/gpu: 1  # 1 GPU
  limits:
    cpu: "8000m"      # burst up to 8 vCPUs
    memory: "32Gi"
    nvidia.com/gpu: 1
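Kubernetes accepts CPU quantities either with the m suffix ("4000m") or as plain core counts ("4", "0.5"), and it rejects anything finer than 1m. The following is a minimal Python sketch of that conversion, not Kubernetes' own parser. The function names cpu_to_millicores and millicores_to_vcpus are made up for illustration.

```python
from decimal import Decimal


def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes-style CPU quantity ("500m", "0.5", "4") to millicores.

    Hypothetical helper for illustration; it handles only the two common forms.
    """
    text = quantity.strip()
    if text.endswith("m"):
        # Already expressed in millicores, e.g. "4000m".
        return int(text[:-1])
    # A plain number means whole or fractional cores; Decimal avoids float rounding.
    millicores = Decimal(text) * 1000
    if millicores != millicores.to_integral_value():
        raise ValueError(f"{quantity!r} is finer than 1m, the smallest CPU unit")
    return int(millicores)


def millicores_to_vcpus(millicores: int) -> float:
    """Convert millicores back to vCPUs (1000m = 1 vCPU)."""
    return millicores / 1000


print(cpu_to_millicores("4000m"))  # 4000
print(cpu_to_millicores("0.5"))    # 500
print(millicores_to_vcpus(8000))   # 8.0
```

Returning an integer keeps later arithmetic exact, which matters when comparing requests against limits.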

Why it matters for GPU workloads

Many GPU cloud providers (CoreWeave, RunPod, Lambda) run Kubernetes under the hood. When you deploy a GPU workload as a container, you specify CPU in millicores alongside your GPU request.

Getting this wrong has real consequences:

  • Too few millicores → CPU can't feed data to the GPU fast enough → GPU sits idle → you're paying for unused compute
  • Too many millicores → you're reserving CPU you don't need → higher cost, potentially blocking other workloads

Common millicore allocations for GPU pods

  • Inference (single model): 2000m – 4000m
  • Training (data-heavy): 8000m – 16000m
  • Multi-GPU training: 16000m – 32000m
  • Lightweight serving: 500m – 1000m
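As a rough illustration, the ranges above can be encoded as a lookup that flags a CPU request as low, typical, or high for its workload type. This is a sketch only: the dictionary keys and the check_cpu_request function are hypothetical names, and the ranges are starting points rather than hard rules.

```python
# Typical CPU ranges from the list above, in millicores: (low, high).
# The workload keys are hypothetical labels chosen for this sketch.
TYPICAL_CPU_MILLICORES = {
    "lightweight_serving": (500, 1000),
    "inference": (2000, 4000),
    "training": (8000, 16000),
    "multi_gpu_training": (16000, 32000),
}


def check_cpu_request(workload: str, request_m: int) -> str:
    """Classify a CPU request against the typical range for a workload type."""
    low, high = TYPICAL_CPU_MILLICORES[workload]
    if request_m < low:
        return "low"      # risk: CPU cannot feed the GPU fast enough
    if request_m > high:
        return "high"     # risk: paying for reserved CPU that sits unused
    return "typical"


print(check_cpu_request("inference", 1000))   # low
print(check_cpu_request("training", 12000))   # typical
print(check_cpu_request("inference", 8000))   # high
```

A "low" or "high" result is only a prompt to look at real CPU and GPU utilization, since data pipelines vary widely between workloads.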

Quick conversion table

Millicores  vCPUs  Description
100m        0.1    Minimal: sidecar containers, monitoring agents
250m        0.25   Light: small API servers
500m        0.5    Medium: web servers, light processing
1000m       1      One full core
4000m       4      Standard GPU inference pod
8000m       8      Training with moderate data pipeline
16000m      16     Heavy training, complex preprocessing

How it relates to GIS

GIS uses compute.vcpus (whole vCPUs), not millicores. To convert:

millicores = compute.vcpus × 1000

// Lambda H100: 26 vCPUs = 26,000m available
// AWS p5.48xlarge: 192 vCPUs = 192,000m available

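When deploying on Kubernetes-based providers, your pod requests a share of the instance's vCPUs, expressed in millicores. The node reserves some CPU for the operating system and Kubernetes components, so slightly less than the full total is allocatable to pods.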
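The conversion and the "fraction of the instance" idea can be sketched in Python as follows. The function names and the reserved_m parameter (CPU held back for system components) are assumptions made for this example, and the calculation looks at CPU only. In practice GPU count and memory also limit how many pods fit.

```python
def vcpus_to_millicores(vcpus: int) -> int:
    """millicores = compute.vcpus x 1000"""
    return vcpus * 1000


def request_share(instance_vcpus: int, pod_request_m: int) -> float:
    """Fraction of the instance's total CPU that one pod request claims."""
    return pod_request_m / vcpus_to_millicores(instance_vcpus)


def pods_that_fit(instance_vcpus: int, pod_request_m: int, reserved_m: int = 0) -> int:
    """How many pods fit by CPU alone, after subtracting reserved_m (hypothetical)."""
    available_m = vcpus_to_millicores(instance_vcpus) - reserved_m
    return max(available_m, 0) // pod_request_m


print(vcpus_to_millicores(26))           # 26000
print(round(request_share(26, 4000), 3)) # 0.154
print(pods_that_fit(26, 4000))           # 6
print(pods_that_fit(192, 16000))         # 12
```

The two instance sizes are the 26-vCPU and 192-vCPU examples above. A 4000m pod takes about 15% of the smaller instance's CPU.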

Key takeaways
  • 1 millicore (1m) = 1/1000th of a CPU core
  • 1000m = 1 vCPU = 1 full core
  • Used in Kubernetes resource requests and limits
  • 500m means your container gets half a CPU core
  • GPU pods typically request 4000m – 16000m (4 – 16 cores) alongside GPU resources