LLM cost quotas
Last updated 2026-06-14
Definition
LLM cost quotas cap how much AI spend a workspace can run up. Chat answers and other AI features call language models, and every request costs money. A cost quota puts a ceiling on that spend per workspace (tenant), so a heavy month in one workspace stays within bounds instead of producing a surprise bill that lands on the whole account.
How to do this in Quri
- Check the AI cost ceiling tied to the workspace’s plan tier.
- Use chat and AI features as normal; each request counts toward the cap.
- Watch for the notice when a workspace nears or reaches its AI budget.
- Move to a higher tier when a workspace genuinely needs more AI headroom.
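
The steps above can be pictured as a small bookkeeping model. This is only an illustrative sketch, not Quri's implementation: the tier names, the ceilings in cents, the 80% "nearing" threshold, and the `WorkspaceBudget` class are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical ceilings per plan tier, in cents per budget period.
# These are placeholder figures, not Quri's real limits.
TIER_CEILING_CENTS = {"free": 500, "team": 5_000, "enterprise": 50_000}

# Assumed percentage of the ceiling at which a "nearing budget" notice shows.
NEAR_LIMIT_PERCENT = 80


@dataclass
class WorkspaceBudget:
    tier: str
    spent_cents: int = 0

    @property
    def ceiling_cents(self) -> int:
        """Look up the AI cost ceiling tied to the workspace's plan tier."""
        return TIER_CEILING_CENTS[self.tier]

    def record_request(self, cost_cents: int) -> None:
        """Count one AI request's cost toward the workspace's cap."""
        self.spent_cents += cost_cents

    def status(self) -> str:
        """Return 'ok', 'near_limit', or 'reached' for the current spend."""
        if self.spent_cents >= self.ceiling_cents:
            return "reached"
        # Integer comparison avoids floating-point rounding at the threshold.
        if self.spent_cents * 100 >= NEAR_LIMIT_PERCENT * self.ceiling_cents:
            return "near_limit"
        return "ok"
```

In this sketch, moving a workspace to a higher tier (for example, setting `tier = "team"`) raises the ceiling while the recorded spend stays the same, which is why an upgrade frees up headroom straight away.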
Frequently asked
- Why cap AI spend per workspace?
- Because LLM calls cost money per request, an unbounded month could run up a large bill. Capping per workspace keeps one tenant’s heavy usage from becoming a surprise charge across the whole account.
- What happens when a workspace hits its AI budget?
- AI features pause until the budget resets, the same way quota enforcement works for other actions. To give a workspace more AI headroom, upgrade its plan tier.
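
As a rough illustration of that last answer, the sketch below gates each AI request on the remaining budget and clears the spend when a new period begins. It rests on assumptions: the calendar-month reset, the `AiBudget` class, and the `try_ai_request` function are hypothetical and only stand in for whatever schedule and enforcement Quri actually uses.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AiBudget:
    ceiling_cents: int
    spent_cents: int = 0
    period: tuple[int, int] = (1970, 1)  # (year, month) the spend belongs to


def try_ai_request(budget: AiBudget, cost_cents: int, today: date) -> bool:
    """Return True and record the cost if the request may run, else False."""
    current = (today.year, today.month)
    if budget.period != current:
        # Assumed reset rule: spend starts from zero each calendar month.
        budget.period = current
        budget.spent_cents = 0
    if budget.spent_cents >= budget.ceiling_cents:
        return False  # budget reached: the request is held back
    budget.spent_cents += cost_cents
    return True


budget = AiBudget(ceiling_cents=100)
first = try_ai_request(budget, 60, date(2026, 6, 1))
second = try_ai_request(budget, 60, date(2026, 6, 2))
third = try_ai_request(budget, 60, date(2026, 6, 3))
after_reset = try_ai_request(budget, 60, date(2026, 7, 1))
```

The check runs before the cost is added, so the request that crosses the ceiling still completes and only later requests are held back. A stricter variant could reject any request whose cost would exceed the remaining budget.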