SLAs, demystified
An uptime percentage is a promise. A service-level agreement is what turns that promise into money when it breaks. Here’s where SLAs came from, how to calculate what you’re actually owed, and how to factor them into every SaaS — and AI — decision you make.
You’re renting reliability
Every SaaS and AI provider you depend on is a single point of failure you don’t control. The SLA is the one place a vendor puts a number on how reliable they promise to be — and the only place they agree to pay you back when they miss. Read it before you sign, not after you’re down.
A short history: where service credits came from
Renting compute by the hour
Service bureaus sold time on shared mainframes. Customers paying for access started expecting a defined level of service — the seed of the idea.
The SLA — and the credit — is born
Telecom carriers formalized the “service-level agreement”: a contractual availability target plus a remedy. Miss the target, and the customer gets a credit on their bill. That pairing — promise + payback — is the modern SLA.
Uptime goes mainstream
Web hosts competed on “99.9%”. AWS published an SLA for S3 in 2007 and for EC2 in 2008, helping turn the “nines” into both a marketing line and an enforceable contract term across the industry.
An SLA on every tool
As companies moved their stack to SaaS, the SLA became a standard procurement checkbox — and a negotiation lever for enterprise contracts.
Renting intelligence by the token
Now critical features run on third-party AI APIs. Many ship with no SLA, or one only on a private enterprise tier — so knowing how to read (and demand) an SLA is a survival skill, not a formality.
The four parts of any SLA
The commitment
The headline availability target, e.g. 99.9% per month. A bigger number isn’t always a stronger promise: check how “downtime” is defined and measured.
The remedy
What you get when they miss — usually service credits (a % of that month’s bill). Some SLAs offer only termination; many offer nothing at all.
The claim window
Credits are rarely automatic. You typically must file a claim within a deadline (often 30 days). Miss it and the credit evaporates.
The exclusions
Scheduled maintenance, beta features, force majeure, your own misconfig — the fine print that quietly shrinks the promise.
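To see how exclusions shrink the promise, here is a minimal sketch of uptime counted the way an SLA might count it. The function name and the rule (excluded minutes are simply subtracted from the outage total) are assumptions for illustration; your vendor's SLA defines the real formula, and some also remove excluded time from the measurement period.

```python
def measured_uptime_pct(total_minutes, outage_minutes, excluded_minutes=0.0):
    """Uptime % when excluded outage minutes do not count against the vendor.

    Assumption: exclusions (maintenance, force majeure, ...) are subtracted
    from the outage total and the measurement period stays unchanged.
    """
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    counted_outage = max(outage_minutes - excluded_minutes, 0.0)
    return 100.0 * (total_minutes - counted_outage) / total_minutes


MONTH_MINUTES = 30 * 24 * 60  # 43,200 minutes in a 30-day month

# Hypothetical month: 120 minutes down, 90 of them during "scheduled maintenance".
raw = measured_uptime_pct(MONTH_MINUTES, 120)
per_contract = measured_uptime_pct(MONTH_MINUTES, 120, excluded_minutes=90)
print(f"what you experienced: {raw:.3f}%")
print(f"what the SLA counts:  {per_contract:.3f}%")
```

In this made-up month you were down for two hours, yet a 99.9% commitment still counts as met once the exclusion is applied.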
What an uptime % actually buys you
Each extra “nine” cuts allowed downtime by ~10×: in a 30-day month, 99.9% permits about 43 minutes, 99.99% about 4 minutes, and 99.999% about 26 seconds.
At 99.9%, a single bad afternoon can blow the entire monthly budget.
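The arithmetic behind those budgets is short enough to run yourself. This is a sketch that assumes a 30-day measurement month; the function name is ours, and an SLA measured per calendar month or per billing cycle will give slightly different figures.

```python
def allowed_downtime_minutes(uptime_pct, days=30):
    """Minutes of outage an uptime target permits over `days` days."""
    if not 0 <= uptime_pct <= 100:
        raise ValueError("uptime_pct must be between 0 and 100")
    return (1 - uptime_pct / 100) * days * 24 * 60


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.2f} min ({minutes * 60:.0f} s) per 30 days")
```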
How much are you owed?
The inputs are your monthly spend, the committed SLA, and the uptime you actually observed. A common tiered credit schedule pays 10%, 25%, or 50% of the monthly bill, with the percentage rising the further uptime falls below the commitment.
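A tiered credit can be sketched in a few lines. The 10% / 25% / 50% percentages come from the schedule above, but the uptime thresholds that trigger each tier (below the commitment, below 99.0%, below 95.0%) are assumptions chosen for illustration; every vendor sets its own.

```python
# Assumed thresholds: observed uptime below the threshold earns that fraction.
# Checked from the deepest miss upward. Real SLAs define their own cut-offs.
ASSUMED_TIERS = [(95.0, 0.50), (99.0, 0.25)]
BASE_CREDIT = 0.10  # any miss that stays at or above 99.0%


def service_credit(monthly_spend, committed_pct, observed_pct):
    """Credit owed under the assumed 10% / 25% / 50% schedule."""
    if observed_pct >= committed_pct:
        return 0.0  # SLA met, nothing owed
    for threshold, fraction in ASSUMED_TIERS:
        if observed_pct < threshold:
            return monthly_spend * fraction
    return monthly_spend * BASE_CREDIT


# Hypothetical account: $2,000/month on a 99.9% commitment.
for observed in (99.95, 99.5, 98.0, 90.0):
    print(f"observed {observed}% -> credit ${service_credit(2000, 99.9, observed):,.2f}")
```

Note what the result is: a credit against a future bill, capped by what you pay the vendor, not a measure of what the outage cost you.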
Two lenses: how DevOps and FinOps each use the SLA
DevOps / SRE
- SLA vs SLO vs SLI. The SLA is the contractual promise to customers; the SLO is the stricter internal target you run to; the SLI is the metric you measure.
- Error budgets. 100% minus your SLO is the failure you can afford to spend on shipping fast: a 99.95% SLO leaves a 0.05% budget.
- Trust but verify. Monitor a provider’s real uptime; their status page is marketing, your synthetics are evidence.
- Design around the weakest SLA. Multi-region, retries, and fallbacks exist because one upstream will spend its downtime budget.
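The SLI, SLO, and error-budget relationship from the list above can be sketched as follows. This assumes a request-based SLI (successful requests divided by total requests); the function names and the traffic numbers are hypothetical.

```python
def sli_pct(total_requests, failed_requests):
    """Request-based SLI: percentage of requests that succeeded."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 100.0 * (total_requests - failed_requests) / total_requests


def error_budget_remaining(slo_pct, total_requests, failed_requests):
    """Fraction of the error budget left: 1.0 is untouched, below 0 is overspent."""
    allowed_failures = (1 - slo_pct / 100) * total_requests
    if allowed_failures <= 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1 - failed_requests / allowed_failures


# Hypothetical month: 1,000,000 requests, 200 failures, internal SLO of 99.95%.
print(f"SLI: {sli_pct(1_000_000, 200):.3f}%")
print(f"budget left: {error_budget_remaining(99.95, 1_000_000, 200):.0%}")
```

Running the SLO stricter than the SLA means the budget runs out, and the alarm goes off, before the contractual promise is at risk.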
FinOps / Procurement
- A credit is not a refund. It offsets a future bill for that one service — it never repays your lost revenue or eng time.
- Claim hygiene. Track outages, file within the deadline, keep evidence. Unclaimed credits are simply forfeited.
- Negotiate at renewal. Higher uptime tiers, bigger caps, and automatic credits are real line items you can push for.
- Price the risk. A cheaper vendor with no SLA can be the expensive choice once you weight it by downtime cost.
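One rough way to price the risk is to add the expected cost of downtime to the sticker price. Everything in this sketch is an assumption: the vendors, their prices, the uptime you expect from each, and the hourly cost of an outage to your business. It also ignores credits, which rarely move the total much.

```python
def risk_weighted_monthly_cost(price, expected_uptime_pct,
                               downtime_cost_per_hour, hours_in_month=720):
    """Sticker price plus expected downtime hours times your cost per hour down."""
    expected_down_hours = (1 - expected_uptime_pct / 100) * hours_in_month
    return price + expected_down_hours * downtime_cost_per_hour


# Hypothetical comparison, with an outage costing you $500/hour.
with_sla = risk_weighted_monthly_cost(1000, 99.9, 500)  # pricier, 99.9% SLA
no_sla = risk_weighted_monthly_cost(600, 99.0, 500)     # cheaper, best-effort
print(f"vendor with SLA: ${with_sla:,.0f}/month")
print(f"vendor, no SLA:  ${no_sla:,.0f}/month")
```

With these made-up numbers the cheaper vendor costs roughly three times as much once downtime is included. The expected-uptime figure for a vendor with no SLA should come from your own monitoring, since there is no contract number to use.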
Where the SLA shows up with every provider
Evaluate
Read the SLA before signing. No public SLA? Treat the service as best-effort and price that risk in.
Negotiate
For anything critical, ask for the enterprise SLA tier, a higher credit cap, and automatic credits in writing.
Operate
Monitor real uptime, alert on breaches, and file credit claims inside the window — every time.
Renew
Bring the outage record to renewal. A bad year is leverage for better terms or an exit.
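For the Operate step, a claim deadline is worth computing the moment an incident closes. This sketch assumes a 30-day window counted from the day the incident ended; some SLAs count from the end of the billing month instead, so check the contract. The function names are hypothetical.

```python
from datetime import date, timedelta


def claim_deadline(incident_end: date, window_days: int = 30) -> date:
    """Last day a credit claim can be filed (assumed: window starts at incident end)."""
    return incident_end + timedelta(days=window_days)


def can_still_claim(incident_end: date, today: date, window_days: int = 30) -> bool:
    """True while `today` is on or before the claim deadline."""
    return today <= claim_deadline(incident_end, window_days)


incident = date(2024, 3, 10)  # hypothetical outage date
print("file by:", claim_deadline(incident).isoformat())
print("still open on 2024-04-10?", can_still_claim(incident, date(2024, 4, 10)))
```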
Why this matters more every quarter
As you build on AI, you inherit its reliability — and its fine print
We track 31 AI & ML providers in the directory. Many ship their standard API “as is” with no uptime SLA, or publish one only on a private enterprise tier. When an AI endpoint is in your critical path, “no SLA” is a business risk you’re carrying whether you’ve priced it or not.
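Inheriting reliability has simple arithmetic behind it: when every dependency sits in your critical path, your best-case availability is the product of theirs. This sketch assumes the dependencies fail independently and that any one failing takes you down; the uptime figures are hypothetical.

```python
import math


def composite_availability_pct(dependency_uptimes_pct):
    """Best-case availability when all dependencies are required (serial chain).

    Assumes independent failures; correlated outages change the picture.
    """
    return 100.0 * math.prod(u / 100 for u in dependency_uptimes_pct)


# Hypothetical stack: cloud host, database service, third-party AI API.
stack = [99.9, 99.95, 99.5]
print(f"composite: {composite_availability_pct(stack):.2f}%")
```

The result is lower than the weakest link on its own, which is why an AI endpoint with no commitment at all makes the whole chain impossible to put a number on.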
Five quick questions
1. A 99.9% monthly uptime SLA allows roughly how much downtime per month?
≈43 minutes. Each extra nine cuts that ~10×: 99.99% ≈ 4 min, 99.999% ≈ 26 sec.
2. When a vendor misses its SLA, a “service credit” is usually…
It’s a bill credit for that one service — not cash, and never your downstream losses.
3. The difference between an SLO and an SLA is…
SLO = internal objective (usually stricter); SLA = the external, contractual commitment with a remedy.
4. A provider’s terms say the service is provided “as is” with no uptime commitment. You have…
“As is” means best-effort. There’s no commitment and no remedy — common for self-serve AI APIs.
5. Who usually has to start the credit claim?
Unless the SLA says credits are automatic, the customer must file within the claim window or forfeit them.