Migrate from Modal
Modal's sweet spot is Python ML and batch workloads: decorate a function, run it on a GPU. Joule Cloud doesn't use Python decorators — we're container/runtime-agnostic — but the workloads themselves translate cleanly.
Two patterns
Modal users fall into two camps:
- Inference endpoints — serving a custom or open-weight model via an HTTP API. Move to Inference (if it's an open-weight model already on the router), or to Functions / Compute if you're serving a custom model with custom logic.
- Batch / fine-tuning — longer-running GPU jobs. Move to Compute with a GPU workload (contact us for raw-GPU access at launch).
Inference endpoints
# Modal
- import modal
- app = modal.App("my-classifier")
-
- @app.function(gpu="A10G", image=modal.Image.debian_slim().pip_install("torch", "transformers"))
- @modal.web_endpoint(method="POST")
- def classify(payload: dict):
- from transformers import pipeline
- pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
- return pipe(payload["text"], candidate_labels=payload["labels"])
# Joule Cloud (Functions)
+ // classify.py — Python function on Joule Cloud
+ from transformers import pipeline
+
+ pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
+
+ def handler(req):
+ body = req.json()
+ return pipe(body["text"], candidate_labels=body["labels"])
+ # deploy
+ invisible fn deploy classify.py \
+ --route /classify \
+ --runtime python-3.13 \
+ --memory 2GB \
+ --gpu a10g
Batch jobs (fine-tunes, evals)
Modal's app.run() + @function(gpu="H100") for a one-off → Joule Cloud Compute with a workload running your script:
workload "finetune-job" {
image = "ghcr.io/me/finetuner:latest"
region = "eu-fi" # cleanest grid for long-running GPU
gpu = "h100"
cpu = "16 vCPU"
memory = "128 GB"
cmd = ["python", "train.py", "--config", "configs/v3.yaml"]
scale = { min = 0, max = 1, on_completion = "delete" }
}
Volumes
Modal volumes (modal.Volume) for caching weights / datasets become Object Store buckets. Reads from a workload to a same-region bucket are free on the mesh, so the latency overhead of "remote" storage is minimal in practice.
What you keep
- Container-based runtime (Modal builds containers under the hood; we use them directly)
- GPU access for inference + training
- Web endpoint + cron-style triggers
What you gain
- Energy-priced GPU minutes — you pay for the joules the GPU drew, not for the GPU-second rounded up
- Mixed-language workloads — Modal is Python-centric; we run anything
- Per-job energy + carbon receipt — what regulators will ask for
What's on the roadmap
A Modal-style Python decorator SDK that wraps invisible.hcl + Functions deploy. Useful if you have hundreds of Modal-decorated functions and don't want to rewrite them as containers individually. Reach out if you'd use it.