Functions cold-start tuning
Joule Cloud Functions cold starts are competitive at default settings (V8 isolate: ~8 ms; Python: ~150 ms; Rust-WASM: ~5 ms; Bun: ~20 ms). When you need every millisecond, these are the levers.
The four phases of a cold start
1. Schedule. Router picks a node, account auth checks pass.
2. Image hydrate. Function bundle (or container image for non-isolate runtimes) is fetched from registry to the node. Cached across invocations on the same node for ~5 minutes after last use.
3. Process / isolate spawn. Runtime initialises, the handler module is loaded and executed for its top-level effects.
4. Handler invocation. The actual request hits your code.
The receipt breaks the joules down across these phases: energy.cold_start_j covers phases 1-3, and energy.execution_j covers phase 4. Tune phases 2 and 3 first; that is where the wins are.
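To see how much of a single invocation's energy went to the cold start, you can compare the two receipt fields. This is a minimal sketch that assumes the receipt has already been parsed into a nested dict with an `energy` key; that shape and the sample numbers are illustrative assumptions, not a documented format.

```python
def cold_start_share(receipt: dict) -> float:
    """Fraction of an invocation's energy spent in phases 1-3.

    Assumes a parsed receipt shaped like
    {"energy": {"cold_start_j": ..., "execution_j": ...}} (hypothetical layout).
    """
    energy = receipt.get("energy", {})
    cold = float(energy.get("cold_start_j", 0.0))
    execution = float(energy.get("execution_j", 0.0))
    total = cold + execution
    # Avoid dividing by zero on an empty or malformed receipt.
    return cold / total if total > 0 else 0.0


# Illustrative values only.
example_receipt = {"energy": {"cold_start_j": 8.0, "execution_j": 2.0}}
print(f"cold-start share: {cold_start_share(example_receipt):.0%}")
```

A share close to 100% on cold invocations suggests that phases 1-3, not your handler, dominate the cost.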
Phase 2: image / bundle hydrate
For V8 isolate runtimes (Node, Deno, Bun), your bundle is hydrated from a content-addressable cache; first hit on a new node is ~30-80 ms, subsequent invocations < 1 ms.
For container runtimes (Python with dependencies, Rust binary, WASI), the OCI image is pulled to the node. Levers:
- Make the image small. A 50 MB image hydrates in < 200 ms; a 2 GB image is 5+ seconds. Use distroless or alpine base images, multi-stage builds, and aggressive layer pruning.
- Pin the image SHA in invisible.hcl. SHA-pinned images cache across node migrations; tag-only images can re-hydrate when the tag is reassigned.
- Pre-warm. Set scale.min = 1 to keep one instance hot. Cost: idle joules of one runtime container. For a 256 MB Python runtime that's ~30 J/hour, pennies a month at the per-joule rate.
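A quick pre-deploy check can catch tag-only image references before they reach production. The sketch below assumes your image references use the standard OCI form, where a pinned reference ends in `@sha256:` plus 64 hex characters. How the reference is written inside invisible.hcl is not shown here, so the helper only inspects the reference string; `registry.example.com` is a placeholder.

```python
import re

# A digest-pinned OCI reference ends in "@sha256:" followed by 64 lowercase hex chars.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True if the image reference is pinned by sha256 digest, not only by tag."""
    return bool(_DIGEST_RE.search(image_ref.strip()))


# Placeholder references for illustration.
tag_only = "registry.example.com/team/fn:latest"
pinned = "registry.example.com/team/fn@sha256:" + "a" * 64

for ref in (tag_only, pinned):
    print(ref[:48], "pinned" if is_digest_pinned(ref) else "tag-only")
```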
Phase 3: process spawn + handler init
This is where module imports run. The cardinal rule: do expensive setup once at module load, outside the handler, rather than inside the handler at request time. This only pays off for work whose setup cost is worth keeping warm.
# GOOD: model loads once per process; warm requests reuse it
from transformers import pipeline

pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def handler(req):
    body = req.json()
    return pipe(body["text"], candidate_labels=body["labels"])

# BAD: model reloads on every call. Cold start AND warm-call cost.
def handler(req):
    from transformers import pipeline
    pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    body = req.json()
    return pipe(body["text"], candidate_labels=body["labels"])
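Before deciding what belongs at module load, it helps to know what each import actually costs. This is a rough sketch using only the standard library; the module names are stand-ins, and a module that is already in `sys.modules` will report close to zero. For a per-module breakdown, CPython's `-X importtime` option is another way to get the same information.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the seconds spent importing module_name in this process.

    If the module is already loaded, this measures only a cache lookup.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Stand-in modules; substitute the heavy dependencies your handler imports.
for name in ("json", "decimal", "email.parser"):
    was_loaded = name in sys.modules
    elapsed_ms = time_import(name) * 1000
    print(f"{name}: {elapsed_ms:.2f} ms (already loaded: {was_loaded})")
```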
Module-level lazy import for huge deps
If a dep is only used on some requests, defer its import:
_pdf_extractor = None

def _ensure_pdf():
    global _pdf_extractor
    if _pdf_extractor is None:
        import pypdf  # import cost is only paid on the first PDF request
        _pdf_extractor = pypdf
    return _pdf_extractor

def handler(req):
    if req.path.endswith(".pdf"):
        pdf = _ensure_pdf()
        ...
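If several optional dependencies need the same treatment, the pattern above can be folded into one helper. This is a sketch, not a platform API: `lazy_module` is a hypothetical name, and it adds only a tidy call site on top of Python's own `sys.modules` cache.

```python
import functools
import importlib
from types import ModuleType


@functools.lru_cache(maxsize=None)
def lazy_module(name: str) -> ModuleType:
    """Import `name` on first use; later calls return the same module object."""
    return importlib.import_module(name)


# Demonstrated with a stdlib module; in a handler this might be lazy_module("pypdf").
json_mod = lazy_module("json")
print(json_mod.dumps({"lazy": True}))
```

The import cost is paid on whichever request first reaches the call, so the first request of that kind on each instance is slower than the rest.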
Concurrency vs cold start
One worker instance handles many concurrent requests (up to per_instance_concurrency, default 1000). A second instance spawns only when the first instance's concurrency is exhausted, and that is when you see a second cold start.
invisible fn deploy handler.js \
--runtime node-22 \
--per-instance-concurrency 1000 \
--scale-min 0 \
--scale-max 50
For CPU-bound functions, lower per-instance-concurrency (8-32) and let the platform spawn more instances. For IO-bound functions (most HTTP / DB / RPC code), keep the default 1000 and stay on one instance.
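To get a feel for how the concurrency setting translates into instance count (and so into extra cold starts), you can estimate in-flight requests with Little's law: requests per second times mean latency. The sketch below assumes the platform scales purely on exhausted per-instance concurrency, as described above; the function name and the traffic numbers are illustrative.

```python
import math


def estimate_instances(
    requests_per_sec: float,
    mean_latency_s: float,
    per_instance_concurrency: int = 1000,
    scale_min: int = 0,
    scale_max: int = 50,
) -> int:
    """Rough steady-state instance count, clamped to the scale bounds."""
    # Little's law: average in-flight requests = arrival rate * time in system.
    in_flight = requests_per_sec * mean_latency_s
    needed = math.ceil(in_flight / per_instance_concurrency)
    return max(scale_min, min(scale_max, needed))


# 400 req/s at 250 ms mean latency is about 100 requests in flight.
print(estimate_instances(400, 0.25))                               # default concurrency
print(estimate_instances(400, 0.25, per_instance_concurrency=16))  # CPU-bound setting
```

With the default of 1000 the example fits on one instance; at 16 it needs seven, and each additional instance pays its own cold start.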
Provisioned concurrency
For latency-sensitive functions (user-facing APIs), set scale.min to your floor:
scale = {
  min = 3  # 3 instances always warm; no cold start under normal load
  max = 100
  on_metric = "requests_per_sec > 100"
}
Cost: the idle joule floor of 3 instances. For a typical 256 MB Node runtime, that is ~150 J/hour per instance × 3 instances = 450 J/hour, × 24 hours = ~10.8 kJ/day, or about $0.04/day. Often a justifiable trade for p99 latency.
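The same arithmetic can be scripted for other floors. This sketch reproduces the figures quoted above. The per-joule price is not stated on this page, so the rate below is back-derived from the "~$0.04/day for ~10.8 kJ/day" figure and should be treated as a rough assumption, not a published price.

```python
def idle_floor_joules_per_day(instances: int, idle_j_per_hour: float) -> float:
    """Energy used per day by instances that are warm but idle."""
    return instances * idle_j_per_hour * 24


# Figures from the text: 3 Node instances at ~150 J/hour each.
daily_j = idle_floor_joules_per_day(3, 150.0)

# ASSUMPTION: rate implied by the quoted ~$0.04/day, not an official price.
implied_usd_per_joule = 0.04 / daily_j

print(f"idle floor: {daily_j / 1000:.1f} kJ/day")
print(f"implied rate: ${implied_usd_per_joule:.2e} per joule")

# Reuse the implied rate to size a different floor, e.g. 10 instances.
ten_instances_usd = idle_floor_joules_per_day(10, 150.0) * implied_usd_per_joule
print(f"10 instances: about ${ten_instances_usd:.2f}/day")
```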
Snapshot-based fast restore (experimental, Rust + WASI)
For Rust functions compiled to WASI, we support --snapshot-after-init. The runtime takes a memory snapshot after module load; restoring from the snapshot takes ~0.5 ms, versus ~5 ms for a full re-init. V8 isolates are expected to become eligible in a future release.
invisible fn deploy handler.wasm \
--runtime wasi-preview-2 \
--snapshot-after-init
Measuring it
Every invocation's receipt has execution.cold_start (boolean) and energy.cold_start_j. Pull aggregated stats:
jc fn stats <name> --by cold-start --since 24h
# Cold-start rate: 4.2% (314 cold / 7,510 total)
# Cold-start mean: 142 ms, p99: 290 ms
# Warm-start mean: 11 ms, p99: 38 ms
# Cold-start energy avg: 8.4 J, warm avg: 0.31 J
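If you export raw receipts instead of using the CLI summary, the same numbers can be approximated by hand. This is a sketch that assumes each receipt is a dict exposing `execution.cold_start`, `energy.cold_start_j` and `energy.execution_j` as nested keys. The averaging choices here (cold-start energy over cold invocations, execution energy over warm ones) are an assumption and may not match how the CLI computes its figures.

```python
from statistics import mean


def summarize(receipts: list[dict]) -> dict:
    """Cold-start rate and average energy from a list of parsed receipts."""
    cold = [r for r in receipts if r["execution"]["cold_start"]]
    warm = [r for r in receipts if not r["execution"]["cold_start"]]
    total = len(receipts)
    return {
        "cold_rate": len(cold) / total if total else 0.0,
        # Mean cold-start energy over cold invocations only.
        "cold_start_avg_j": mean(r["energy"]["cold_start_j"] for r in cold) if cold else 0.0,
        # Mean execution energy over warm invocations only.
        "warm_avg_j": mean(r["energy"]["execution_j"] for r in warm) if warm else 0.0,
    }


# Illustrative receipts, not real platform output.
sample = [
    {"execution": {"cold_start": True}, "energy": {"cold_start_j": 8.0, "execution_j": 0.5}},
    {"execution": {"cold_start": False}, "energy": {"cold_start_j": 0.0, "execution_j": 0.25}},
    {"execution": {"cold_start": False}, "energy": {"cold_start_j": 0.0, "execution_j": 0.75}},
    {"execution": {"cold_start": False}, "energy": {"cold_start_j": 0.0, "execution_j": 0.5}},
]
print(summarize(sample))
```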
Anti-patterns
- Pulling secrets from a remote source on every request. Cache them at module load.
- Building DB connection pools per request. Build at module load; the pool persists across warm invocations.
- Doing expensive sync work at module load that isn't needed for every request. Lazy-init the part of the world your function might not touch.
- Setting scale-min to 50 "to be safe". If real load only needs 3 instances, you're paying for 47 idle workers. Watch the actual concurrent-request stats first.