An endpoint.
A flat fee.
Unlimited use.
One OpenAI-compatible URL, served from our cloud of Apple silicon. Point Claude Code, Codex, or any SDK at it and run as many tokens as you want — for the price of a seat, not a meter.
Change the base URL. Keep the SDK.
Chompute speaks the OpenAI protocol end-to-end. Swap the endpoint and your existing tools, agents, and libraries work without a single other change.
```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CHOMPUTE_API_KEY"],
    base_url="https://chompute-services.dragonfruit.ai/openai/v1",
)

completion = client.chat.completions.create(
    model="chompute-lite",
    messages=[
        {"role": "user", "content": "Run this agent on Chompute Inference."},
    ],
)

print(completion.choices[0].message.content)
```

Pick quality. Pick speed. Keep use unlimited.
Quality is the model grade. Speed is how quickly your requests come back. Priced per seat, per month, unlimited use.
| Quality | Fastest | Fast | Standard |
|---|---|---|---|
| Best | | | |
| Advanced | | | |
| Regular | | | |
For builders who want agents running, not billing debates.
If your agent loops have burned your monthly budget by day 11, this is for you.
Run Claude Code all day.
Replace a metered key with a flat monthly one. Iterate without watching a dashboard. Works with Claude Code, Codex, OpenCode, Continue, and any OpenAI-compatible client.
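For clients that honor the standard OpenAI SDK environment variables (the official SDKs do), the swap can be two exports. A minimal sketch; the placeholder fallback is illustrative, and whether a given third-party tool reads these variables is an assumption to verify per tool:

```shell
# Point any OpenAI-SDK-based client at Chompute via the standard env vars.
export OPENAI_API_KEY="${CHOMPUTE_API_KEY:-sk-placeholder}"   # your Chompute key
export OPENAI_BASE_URL="https://chompute-services.dragonfruit.ai/openai/v1"
```

Tools with their own config files typically expose the same two settings as `api_key` and `base_url` fields.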
Convert OpEx into a budget line.
Your agentic sprints compound. With Chompute, cost doesn't. Predictable monthly spend that doesn't explode the first time a loop does.
One endpoint can still make intelligent routing decisions.
The gateway keeps the client contract simple while Chompute decides where capacity should run.
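To make the idea concrete, here is an illustrative sketch of endpoint-preserving routing: the client always sends the same model name to the same URL, and the gateway picks a backend node behind it. Every name below (`BACKENDS`, `pick_backend`, the node IDs) is hypothetical; this is not Chompute's implementation.

```python
# Hypothetical routing table: model name -> nodes that can serve it.
BACKENDS = {
    "chompute-lite": ["node-a", "node-b"],
}

def pick_backend(model: str, load: dict[str, int]) -> str:
    """Route a request for `model` to the least-loaded node serving it.

    The client contract never changes: it only ever sees the one
    OpenAI-compatible URL, regardless of which node answers.
    """
    nodes = BACKENDS[model]
    return min(nodes, key=lambda n: load.get(n, 0))

print(pick_backend("chompute-lite", {"node-a": 3, "node-b": 1}))
# → node-b
```

The point is where the complexity lives: capacity decisions happen behind the endpoint, so your code and configs stay frozen while the fleet changes.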
Get early access.
Tell us about your workload in the waitlist form. We will follow up with endpoint availability, pricing details, and the migration path for your stack.