An endpoint.
A flat fee.
Unlimited use.
One OpenAI-compatible URL, served from our cloud of Apple silicon. Point Claude Code, Codex, or any SDK at it and run as many tokens as you want — for the price of a seat, not a meter.
Change the base URL. Keep the SDK.
Chompute speaks the OpenAI protocol end-to-end. Swap the endpoint and your existing tools, agents, and libraries work without a single other change.
```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CHOMPUTE_API_KEY"],
    base_url="https://chompute-services.dragonfruit.ai/openai/v1",
)

completion = client.chat.completions.create(
    model="chompute-lite",
    messages=[
        {"role": "user", "content": "Run this agent on Chompute Inference."},
    ],
)

print(completion.choices[0].message.content)
```

Pick quality. Pick speed. Keep use unlimited.
Quality is the model grade. Speed is how quickly your requests come back. Priced per seat, per month, unlimited use.
| Quality | Fastest | Fast | Standard |
|---|---|---|---|
| Best | | | |
| Advanced | | | |
| Regular | | | |
For builders who want agents running, not billing debates.
If your agent loops have burned your monthly budget by day 11, this is for you.
Run Claude Code all day.
Replace a metered key with a flat monthly one. Iterate without watching a dashboard. Works with Claude Code, Codex, OpenCode, Continue, and any OpenAI-compatible client.
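For clients that honor the standard OpenAI SDK environment variables (the official SDKs do), the swap can be two exports. A minimal sketch; the placeholder fallback is illustrative, and whether a given third-party tool reads these variables is an assumption to verify per tool:

```shell
# Point any OpenAI-SDK-based client at Chompute via the standard env vars.
export OPENAI_API_KEY="${CHOMPUTE_API_KEY:-sk-placeholder}"   # your Chompute key
export OPENAI_BASE_URL="https://chompute-services.dragonfruit.ai/openai/v1"
```

Tools with their own config files typically expose the same two settings as `api_key` and `base_url` fields.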
Convert OpEx into a budget line.
Your agentic sprints compound. With Chompute, cost doesn't. Predictable monthly spend that doesn't explode the first time a loop does.
One endpoint can still make intelligent routing decisions.
The gateway keeps the client contract simple while Chompute decides where capacity should run.
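To make the idea concrete, here is an illustrative sketch of endpoint-preserving routing: the client always sends the same model name to the same URL, and the gateway picks a backend node behind it. Every name below (`BACKENDS`, `pick_backend`, the node IDs) is hypothetical; this is not Chompute's implementation.

```python
# Hypothetical routing table: model name -> nodes that can serve it.
BACKENDS = {
    "chompute-lite": ["node-a", "node-b"],
}

def pick_backend(model: str, load: dict[str, int]) -> str:
    """Route a request for `model` to the least-loaded node serving it.

    The client contract never changes: it only ever sees the one
    OpenAI-compatible URL, regardless of which node answers.
    """
    nodes = BACKENDS[model]
    return min(nodes, key=lambda n: load.get(n, 0))

print(pick_backend("chompute-lite", {"node-a": 3, "node-b": 1}))
# → node-b
```

The point is where the complexity lives: capacity decisions happen behind the endpoint, so your code and configs stay frozen while the fleet changes.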
Get early access.
Tell us about your workload in the waitlist form. We will follow up with endpoint availability, pricing details, and the migration path for your stack.