What is a joule, here

A joule is one watt-second — the SI unit of energy. On Joule Cloud, the joule is the unit of account. You ship work; we measure the energy that work actually consumed at the silicon; the bill is the sum of those measurements.

You can't optimize what you can't see. You can't price honestly what you can't measure. The joule is what physics charges for computation.

How we measure it

Every request and every workload run is wrapped by a thin middleware layer that reads the silicon's own hardware counters at request start and end. The deltas are the joules.

Silicon	Counter	Precision
Intel / AMD x86	Intel RAPL (per-socket package + DRAM)	~5%
NVIDIA GPU	NVML (per-device power.draw, integrated)	~10%
Apple Silicon	IOReport (CPU + GPU + Neural Engine)	~3%
Anything else	Util × published TDP, declared as derived	~20-30%

Which method we used appears on every response in the X-Energy-Method header, and on the receipt as energy.method + energy.method_confidence. An auditor can always tell whether a number came from the chip itself or was derived from utilization.

Why joules instead of GPU-hours or tokens

Token-based and hour-based billing both ignore reality. A 30-second request on an idle H100 costs the provider one thirtieth of an hour's reservation but ate ~25 watt-hours of energy. A token-priced call that internally runs reasoning may consume 100× the energy of a token-priced call that hits the cache. The customer pays the same in both cases. We don't think they should.

Joules make billing track physics:

Idle is near-zero. A waiting container draws idle-floor watts, not reservation watts.
Inefficient code shows up. A quadratic loop on a million rows costs more in joules than the linear version — even if both finish in the same wall-clock.
Architecture matters. An H100 doing matmul at 700 W charges 700 W. A Graviton doing the same matmul at 90 W charges 90 W. The router puts your call on the chip whose energy profile matches your call's needs — see Routing & placement.

How the joule appears in your code

Every API response carries the joules used in an X-Energy-Joules response header. Most OpenAI-compatible SDKs expose response headers via response.headers or response.response.headers.

r = client.chat.completions.create(model="auto", messages=[...])
print(r.response.headers["x-energy-joules"])  # e.g. "0.31"

For batch workloads, query the per-workload joule total from the API:

curl https://api.greenjoules.cloud/v1/usage/workload/<id> \
  -H "Authorization: Bearer jc_…"

What a joule is worth

Joules at scale, with rough reference orientation:

Quantity	About
1 J	One second of a small LED bulb
100 J	A mid-size reasoning request on H100
1 kJ	A small image generation
1 MJ	Roughly 12 hours of a 25 W laptop
1 GJ	A long fine-tuning job on multiple GPUs

To see how we make sure your joules don't cost more than they have to, read Routing & placement. To see what the per-request record looks like, read Energy receipts.

What is a joule, here

How we measure it

Why joules instead of GPU-hours or tokens

How the joule appears in your code

What a joule is worth

Next