Appliance · on-premises

Intelligence
behind your firewall.

A pre-configured cluster of Mac Studios and Mac minis shipped to your facility. We ship it, we enroll it, we keep it current. You get unlimited inference for every internal tool — with zero data ever leaving your network.

Chompute appliance rack with active inference nodes
2.56 TB · pooled unified memory · 16-node cluster
2.6 kW · peak draw · 8 Studio + 8 mini nodes
< 4 wk · capex breakeven · vs metered cloud APIs
Zero egress to public cloud · all inference local
Who it's for

Teams who can't send their data anywhere — but still want the best models.

Enterprise IT and engineering leadership, running regulated workloads or guarding IP that never leaves the network.

Enterprise IT / CIO

Procurement-grade deployment.

  • Flat capex — no per-token metering, ever
  • HIPAA, SOC 2 Type II, and FedRAMP-ready posture
  • Apple Business Manager zero-touch enrollment
  • GitOps-managed runtimes, models, patches
VP Engineering

Architecture that actually holds up.

  • Pool 64 GB to 256 GB per node over Thunderbolt 5 RDMA
  • Run 7B → 1T-parameter models on one cluster
  • OpenAI-compatible gateway (drop-in for SDKs)
  • Observability: KV-cache hit rate, memory, queue depth
Hardware

Apple silicon. Memory where GPUs have none.

Unified Memory Architecture puts up to 256 GB of high-bandwidth memory next to the Neural Engine, so one appliance runs workloads that would normally take four enterprise GPUs.

Mac Studio class node

Runs deep reasoning, planning, and trillion-parameter MoE workloads. Cluster 4+ to serve frontier-size models over RDMA.

256 GB memory · 76 GPU cores · 215 W peak · Role: Heavy
Mac mini class node

Handles routine extraction, structured output, and tool-use calls at high throughput. Stacks densely — no datacenter HVAC required.

64 GB memory · 70–100 tok/s · 110 W peak · Role: Fast
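As a rough sizing sketch (an illustration, not a published Chompute spec): quantized model weights occupy roughly params × bits ÷ 8 bytes, so a trillion-parameter model at 4-bit precision needs about 500 GB of weights, which fits inside four pooled 256 GB Studio-class nodes, with KV cache and activations taking additional headroom.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Approximate weight memory in GB for a quantized model: params * bits / 8."""
    return params_billions * bits_per_weight / 8

# A 1T-parameter model at 4-bit precision:
print(weight_footprint_gb(1000))   # 500.0 GB of weights
# Four pooled 256 GB Studio-class nodes:
print(4 * 256)                     # 1024 GB pooled
```

The same rule of thumb explains why a single 64 GB mini-class node comfortably serves 7B-class models at 8-bit while the pooled cluster is reserved for frontier-size work.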
How we ship

From purchase order to usable inference.

Our fleet management heritage means the appliance shows up ready. Your IT team never touches a terminal.

01

We spec your fleet

Tell us your workload mix (planning vs. extraction vs. vision). We size the cluster — usually 8–32 nodes.

02

Devices ship to your site

Pre-racked and labeled. Your team plugs in power and network. No image to flash, no firmware to chase.

03

Zero-touch enrollment

On first boot, Apple Business Manager authenticates hardware identifiers and joins the Chompute control plane.

04

Runtimes and models pull

Containerized MLX, vLLM, and LM Studio runtimes download locally. Models stream in over your WAN.

05

Gateway comes online

Your developers point their tools at your internal Chompute endpoint. OpenAI-compatible from minute one.
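Because the gateway speaks the OpenAI wire protocol, pointing existing tools at it is a URL change. A minimal sketch using only the Python standard library; the gateway URL, token, and model name are illustrative placeholders, not real Chompute values:

```python
import json
from urllib import request

# Illustrative internal endpoint -- replace with your gateway's URL.
GATEWAY = "https://chompute.internal.example/v1/chat/completions"

# The gateway accepts the standard OpenAI chat-completions payload:
payload = {
    "model": "local-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Summarize this incident report."}],
}
req = request.Request(
    GATEWAY,
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer internal-token",  # placeholder credential
    },
)
# request.urlopen(req) would return the familiar OpenAI-style JSON response,
# served entirely from inside your network.
```

Official OpenAI SDKs work the same way: set their base URL option to the internal endpoint and leave application code untouched.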

06

We keep it fresh

GitOps-driven continuous sync pushes model updates, security patches, and routing policies. Failover is automatic.

Reference spec

The Chompute Rack

Compute: 8× Mac Studio class nodes + 8× Mac mini class nodes
Pooled memory: 2.56 TB unified memory
Interconnect: Thunderbolt fabric
Max model size: 7B to 1T open-weight models
Peak draw: 2.6 kW
Footprint: Rack or shelf deployment
Noise: Network-closet friendly profile
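The pooled-memory figure follows directly from the node mix above: eight 256 GB Studio-class nodes plus eight 64 GB mini-class nodes.

```python
# 8 Studio-class nodes at 256 GB + 8 mini-class nodes at 64 GB
studio_gb = 8 * 256        # 2048 GB
mini_gb = 8 * 64           # 512 GB
pooled_gb = studio_gb + mini_gb
print(pooled_gb)           # 2560 GB = 2.56 TB
```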
Illustrative TCO
$76K+/yr: metered API baseline + usage overages

Fixed capex: owned capacity · no token overages

Cluster-16 · Apple silicon fleet · online
Thunderbolt fabric · 2.56 TB pooled memory · 120 tok/s
Nodes: 8× Mac mini "fast" (M4-01 to M4-08) · 8× Mac Studio "heavy" (STU-01 to STU-08)
Privacy

Hardware-level sovereignty.

Put inference capacity where the most sensitive work already lives. Chompute keeps the operational surface compatible with the tooling your teams already use while giving enterprises a real local-first path.

On-prem only

Keep sensitive prompts, source code, and records inside the customer-controlled environment.

HIPAA and SOC 2 ready

Built for enterprise diligence instead of “demo first, policy later” rollouts.

Remote attestation

Know which devices are enrolled, ready, and serving inside the fleet.

PII redaction

Add gateway-level controls before prompts move through agent workflows.

Verticals

Built for environments where control matters.

Engineering and IT

Code assistants, incident automation, CI review, internal knowledge agents.

Healthcare

Local document intelligence and care operations where PHI boundaries matter.

Industrial

Inference near facilities, telemetry, and operations teams that cannot depend on fragile cloud paths.

Marketing

Always-on creative and merchandising agents without runaway usage anxiety.

Tell us your workload

Bring us the agent loop that keeps growing.

We will help map it to appliance capacity, endpoint capacity, or a practical path that starts hosted and moves on-prem when the business case is clear.