Skip to main content
LLM Platform

Bring your own LLM. Or use ours.

Pick a provider per workspace. Pick a model per app. Forecast spend before it lands and cap it before it bites. The model layer is yours, not ours.

Free during early access. No credit card required.

Provider catalog

Four ways to plug in a model.

Pick by capability, by price, or by your own compliance constraints. Switch any time — the prompt layer stays put.

O

OpenAI

GPT-4o, GPT-4.1, GPT-4o-mini. Bring your own org key, or use the workspace pool.

gpt-4ogpt-4.1gpt-4o-minio1-mini
A

Anthropic

Claude Opus, Sonnet, Haiku. Best for long-context, careful reasoning, and tool use.

claude-opus-4claude-sonnet-4claude-haiku-4
C

Cloudflare AI Gateway

Edge-routed inference with caching, rate-limiting, and analytics — across providers.

Workers AILlama 3MistralMulti-provider routing
S

Self-hosted

Point Omazy at your own OpenAI-compatible endpoint. vLLM, Ollama, Together, your VPC — your call.

OpenAI-compatiblevLLMOllamaCustom URL

Provider keys are encrypted at rest with per-column AES-GCM. Audit log on every read.

Per-app model selection

Different brand. Different model.

One workspace runs many apps. Cards can pin GPT-4o-mini for cost. Insurance can pin Claude Opus for nuance. No one-model-fits-all.

Per-app model selection Three brand apps — Cards, Loans, Insurance — each routed to a different LLM model. Workspace · acme-bank Per-app model router App · Cards Volume · 14k/day OpenAI gpt-4o-mini Optimised for cost $0.12 / 1k msgs App · Loans Volume · 3.2k/day Anthropic claude-sonnet-4 Disclosure-aware $0.84 / 1k msgs App · Insurance Volume · 1.1k/day Anthropic claude-opus-4 High empathy · long context $3.20 / 1k msgs

Switch a model from the workspace UI in seconds. Active sessions roll over on the next message.

Forecasting + caps

Spend that won't surprise you.

Set a monthly budget per app or per workspace. Omazy forecasts where you'll land and steps in before you blow it.

Warn

80%

Email + in-app alert to billing owners. Service stays on, no degradation.

Pause

95%

New AI replies pause. Human inbox stays live. Owners get a one-click "raise cap" link.

Hard cap

100%

No new spend until next billing cycle or owner override. Audit trail logged.

App · Cards · monthly budget

Cap: $2,400

WARN · 80% $1,920 PAUSE · 95% $2,280 CAP · 100% Now · $1,488 · 62%

Forecast EOM

$2,310 · will hit warn

Days remaining

11 days

Burn rate

$74 / day · steady

Usage dashboard

See every token. Per app. Per model.

Streamed from the trace bus, written to ClickHouse, surfaced live in the workspace dashboard. Slice by app, model, channel, or time of day.

workspace.omazy.cx/acme-bank/llm
Last 30 days Live

Total spend

$3,184

+12% vs last 30d

Tokens in

14.2M

47% prompt cache hit

Tokens out

4.7M

−8% vs last 30d

Avg cost / msg

$0.041

across 3 apps

Daily spend by app

Cards Loans Insurance
$200 $150 $100 $50 Apr 1 Apr 8 Apr 15 Apr 22 Apr 30

OpenAI · gpt-4o-mini

53%

App · Cards

$1,684

Anthropic · claude-sonnet-4

28%

App · Loans

$890

Anthropic · claude-opus-4

19%

App · Insurance

$610

Mock screenshot. Live dashboards stream via the analytics outbox + ClickHouse pipeline.

Your keys, your contract

Bring your existing OpenAI / Anthropic org. We never proxy through a Omazy-owned key unless you opt in.

No model lock-in

Switch a model from a dropdown. Your prompt templates, automations, and KB stay put.

Spend governance

Owners set caps. Operators see forecasts. Nothing escalates without an audit trail.

Pick your LLM. Ship in minutes.

Free during early access. Start with our pool, switch to your own keys whenever you're ready.

✓ Free during early access✓ No credit card✓ Switch providers any time