Migrate from Modal

Modal's sweet spot is Python ML and batch workloads: decorate a function, run it on a GPU. Joule Cloud doesn't use Python decorators — we're container/runtime-agnostic — but the workloads themselves translate cleanly.

Two patterns

Modal users fall into two camps:

Inference endpoints — serving a custom or open-weight model via an HTTP API. Move to Inference (if it's an open-weight model already on the router), or to Functions / Compute if you're serving a custom model with custom logic.
Batch / fine-tuning — longer-running GPU jobs. Move to Compute with a GPU workload (contact us for raw-GPU access at launch).

Inference endpoints

# Modal
- import modal
- app = modal.App("my-classifier")
-
- @app.function(gpu="A10G", image=modal.Image.debian_slim().pip_install("torch", "transformers"))
- @modal.web_endpoint(method="POST")
- def classify(payload: dict):
-     from transformers import pipeline
-     pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
-     return pipe(payload["text"], candidate_labels=payload["labels"])

# Joule Cloud (Functions)
+ // classify.py — Python function on Joule Cloud
+ from transformers import pipeline
+
+ pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
+
+ def handler(req):
+     body = req.json()
+     return pipe(body["text"], candidate_labels=body["labels"])

+ # deploy
+ invisible fn deploy classify.py \
+   --route /classify \
+   --runtime python-3.13 \
+   --memory 2GB \
+   --gpu a10g

Batch jobs (fine-tunes, evals)

Modal's app.run() + @function(gpu="H100") for a one-off → Joule Cloud Compute with a workload running your script:

workload "finetune-job" {
  image  = "ghcr.io/me/finetuner:latest"
  region = "eu-fi"            # cleanest grid for long-running GPU
  gpu    = "h100"
  cpu    = "16 vCPU"
  memory = "128 GB"
  cmd    = ["python", "train.py", "--config", "configs/v3.yaml"]
  scale  = { min = 0, max = 1, on_completion = "delete" }
}

Volumes

Modal volumes (modal.Volume) for caching weights / datasets become Object Store buckets. Reads from a workload to a same-region bucket are free on the mesh, so the latency overhead of "remote" storage is minimal in practice.

What you keep

Container-based runtime (Modal builds containers under the hood; we use them directly)
GPU access for inference + training
Web endpoint + cron-style triggers

What you gain

Energy-priced GPU minutes — you pay for the joules the GPU drew, not for the GPU-second rounded up
Mixed-language workloads — Modal is Python-centric; we run anything
Per-job energy + carbon receipt — what regulators will ask for

What's on the roadmap

A Modal-style Python decorator SDK that wraps invisible.hcl + Functions deploy. Useful if you have hundreds of Modal-decorated functions and don't want to rewrite them as containers individually. Reach out if you'd use it.