Migrate from OpenAI

Joule Cloud's inference API is a drop-in replacement for OpenAI's. Two client settings change, base_url and api_key, and you swap the model name, because OpenAI models are not served on Joule Cloud. The rest of your integration keeps working: same routes, same request shapes, same response shapes.

The diff

from openai import OpenAI

- client = OpenAI()  # reads OPENAI_API_KEY from env
+ client = OpenAI(
+     base_url="https://api.greenjoules.cloud/v1",
+     api_key="jc_…"  # from portal.greenjoules.cloud
+ )

  r = client.chat.completions.create(
-     model="gpt-4o-mini",
+     model="auto",  # or pin: "llama-3.3-70b-instruct"
      messages=[{"role":"user","content":"hi"}],
  )

That's it. The response object has the same fields. r.choices[0].message.content works. Streaming works. Tool use works. JSON mode works.

What's the same

Capability | Joule Cloud
Chat completions | POST /v1/chat/completions, identical shape
Streaming (SSE) | Set stream: true, parse data: lines, same delta shape
Tool / function calling | tools array on request, tool_calls on response, identical
JSON mode | response_format: {"type": "json_object"}
Embeddings | POST /v1/embeddings
Vision (image input) | content array with image_url parts
Model listing | GET /v1/models
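The streaming row can be made concrete with a small parser. This is a minimal sketch, not Joule Cloud's own client code: it assumes the stream follows OpenAI's convention of one JSON chunk per data: line and a closing data: [DONE] sentinel, and the function name collect_stream_text and the sample lines are made up for illustration.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content from OpenAI-style SSE lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        # Text from every choice is appended in arrival order,
        # which is only meaningful for single-choice requests.
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample lines in the assumed wire format.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))
```

If you already stream through the OpenAI SDK, you do not need this: the SDK does the same parsing for you. The sketch is mainly useful for raw HTTP clients.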

What's different

Capability | How it differs
Model names | OpenAI models are not on the mesh. Use model: "auto" for the cheapest capable choice, or pin one of: llama-3.3-70b-instruct, llama-3.3-405b-instruct, mixtral-8x22b, qwen2.5-72b, deepseek-v3, claude-haiku-4.5 (where licensed), gpt-oss-120b.
Energy headers | Every response includes X-Energy-Joules plus the sibling headers X-Carbon-mg, X-Routed-To, X-Tier, and X-Receipt-Id. OpenAI sends none of these.
Pricing model | Billed in joules consumed, not per token. See Pricing for orientation.
Rate limits | No per-minute token caps. Your only ceiling is your account balance (or an explicit per-workload energy_budget).
Region / data residency | You can pin a jurisdiction via the request header X-Region: eu-fi. Default behaviour is "cheapest capable" worldwide.
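To illustrate the region pin, here is a sketch that builds the same request as the curl example with an optional X-Region header added. It uses only the standard library and does not send anything. The helper name build_chat_request and the placeholder key "jc_example" are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "https://api.greenjoules.cloud/v1"


def build_chat_request(api_key, messages, model="auto", region=None):
    """Build (but do not send) a chat completions request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if region is not None:
        headers["X-Region"] = region  # pin a jurisdiction, e.g. "eu-fi"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions", data=body, headers=headers, method="POST"
    )


# "jc_example" is a placeholder, not a real token.
req = build_chat_request(
    "jc_example", [{"role": "user", "content": "hi"}], region="eu-fi"
)
print(req.get_method(), req.full_url)
print(req.header_items())
```

Passing the result to urllib.request.urlopen would send it. With the openai Python SDK, the same header can be supplied per request through the extra_headers argument of create().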

Migration checklist

  1. Sign up at portal.greenjoules.cloud and put $5 on file.
  2. Mint a token in the portal under Tokens → New token. Scope to inference-only if you don't need the full surface.
  3. Swap your client config: base_url and api_key as above.
  4. Pick a model. Start with "auto" and let the router classify. Switch to a pinned model if you need deterministic behaviour.
  5. Run your existing test suite against the new endpoint. Most regressions you'll find are around model-specific output style, not API shape.
  6. Capture the joule headers in your observability stack. X-Energy-Joules, X-Tier, X-Routed-To — these are the data you don't have today.
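For step 6, one possible shape for the capture code is sketched below. It turns a mapping of response headers into a typed record and prints it as a JSON log line. The header names come from this page. Everything else is an assumption: the EnergyReceipt class, the idea that the numeric headers carry plain decimal strings, and the sample values are all made up.

```python
import json
from dataclasses import asdict, dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnergyReceipt:
    joules: Optional[float]
    carbon_mg: Optional[float]
    routed_to: Optional[str]
    tier: Optional[str]
    receipt_id: Optional[str]


def _to_float(value: Optional[str]) -> Optional[float]:
    # Assumption: numeric headers are plain decimal strings.
    try:
        return float(value) if value is not None else None
    except ValueError:
        return None


def parse_energy_headers(headers: Mapping[str, str]) -> EnergyReceipt:
    # HTTP header names are case-insensitive, so lower-case before lookup.
    lower = {k.lower(): v for k, v in headers.items()}
    return EnergyReceipt(
        joules=_to_float(lower.get("x-energy-joules")),
        carbon_mg=_to_float(lower.get("x-carbon-mg")),
        routed_to=lower.get("x-routed-to"),
        tier=lower.get("x-tier"),
        receipt_id=lower.get("x-receipt-id"),
    )


# Made-up header values, for illustration only.
sample_headers = {
    "X-Energy-Joules": "12.5",
    "X-Carbon-mg": "3.1",
    "X-Routed-To": "llama-3.3-70b-instruct",
    "X-Tier": "example-tier",
    "X-Receipt-Id": "example-receipt",
}
receipt = parse_energy_headers(sample_headers)
print(json.dumps(asdict(receipt)))  # one structured log line per request
```

Any mapping of response headers works as input, whichever HTTP client produced it. Missing or unparseable values become None rather than raising, so a logging failure cannot break the request path.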

Patterns by SDK

Python (openai)

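import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.greenjoules.cloud/v1",
    api_key=os.environ["JC_API_KEY"],
)

# with_raw_response exposes the HTTP headers; .parse() returns the usual completion object
raw = client.chat.completions.with_raw_response.create(model="auto", messages=[...])
r = raw.parse()
joules = raw.headers.get("x-energy-joules")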

Node (openai npm)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.greenjoules.cloud/v1",
  apiKey: process.env.JC_API_KEY,
});

// .withResponse() returns the parsed completion plus the raw fetch Response,
// which is where the X-Energy-Joules header lives
const { data: r, response } = await client.chat.completions
  .create({ model: "auto", messages: [...] })
  .withResponse();
// r._request_id, etc. are still available on r
const joules = response.headers.get("x-energy-joules");

Vercel AI SDK

import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const jc = createOpenAI({
  baseURL: "https://api.greenjoules.cloud/v1",
  apiKey: process.env.JC_API_KEY,
});

const { text } = await generateText({
  model: jc("auto"),
  prompt: "hi",
});

LangChain (Python)

import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.greenjoules.cloud/v1",
    api_key=os.environ["JC_API_KEY"],
    model="auto",
)

curl

curl https://api.greenjoules.cloud/v1/chat/completions \
  -H "Authorization: Bearer $JC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"hi"}]}'

Rollback

Both endpoints can run in parallel. Most teams keep the OpenAI client as a fallback for a week or two, behind a feature flag, then retire it. To roll back, restore the original base_url, api_key, and model name. The rest of the application code does not need to change.
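One way to implement the feature flag is to keep both configurations in a single function and pick one from the environment. This is a sketch under assumptions: the flag name USE_JOULE_CLOUD and the helper provider_settings are hypothetical, and the OpenAI-side model is the gpt-4o-mini from the diff at the top of this page.

```python
import os


def provider_settings(env=os.environ):
    """Return (client kwargs, model name) for the active provider."""
    # USE_JOULE_CLOUD is a hypothetical flag; "1" (the default) selects Joule Cloud.
    if env.get("USE_JOULE_CLOUD", "1") == "1":
        return {
            "base_url": "https://api.greenjoules.cloud/v1",
            "api_key": env["JC_API_KEY"],
        }, "auto"
    # Fallback: the SDK's default base URL, with the original OpenAI model.
    return {"api_key": env["OPENAI_API_KEY"]}, "gpt-4o-mini"


# Placeholder values, for illustration only.
kwargs, model = provider_settings({"USE_JOULE_CLOUD": "1", "JC_API_KEY": "jc_example"})
print(sorted(kwargs), model)
```

In application code this would be used as kwargs, model = provider_settings() followed by OpenAI(**kwargs). Flipping the flag to "0" restores the OpenAI endpoint and model without a code change.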

Common gotchas

Next

For the full API surface, see the API reference. To understand what your bill will look like, see Pricing and What is a joule.