SLAs, demystified

An uptime percentage is a promise. A service-level agreement is what turns that promise into money when it breaks. Here’s where SLAs came from, how to calculate what you’re actually owed, and how to factor them into every SaaS — and AI — decision you make.

5 min read + play · 3 interactive tools · 99.9% = 43 min/mo down

1 Why this matters

You’re renting reliability

Every SaaS and AI provider you depend on is a single point of failure you don’t control. The SLA is the one place a vendor puts a number on how reliable they promise to be — and the only place they agree to pay you back when they miss. Read it before you sign, not after you’re down.

2 A short history

Where service credits came from

1960s–70s: Mainframes

Renting compute by the hour

Service bureaus sold time on shared mainframes. Customers paying for access started expecting a defined level of service — the seed of the idea.

1980s–90s: Telecom

The SLA — and the credit — is born

Telecom carriers formalized the “service-level agreement”: a contractual availability target plus a remedy. Miss the target, and the customer gets a credit on their bill. That pairing — promise + payback — is the modern SLA.

2000s: The cloud

Uptime goes mainstream

Web hosts competed on “99.9%”. AWS published an SLA for S3 in 2007 and for EC2 in 2008, making the “nines” both a marketing line and an enforceable contract term for the whole industry.

2010s: SaaS

An SLA on every tool

As companies moved their stack to SaaS, the SLA became a standard procurement checkbox — and a negotiation lever for enterprise contracts.

2020s: The AI era

Renting intelligence by the token

Now critical features run on third-party AI APIs. Many ship with no SLA, or one only on a private enterprise tier — so knowing how to read (and demand) an SLA is a survival skill, not a formality.

3 Anatomy

The four parts of any SLA

The commitment

The headline availability target, e.g. 99.9% per month. A bigger number isn’t always a stronger promise: check how “downtime” is defined and measured.

The remedy

What you get when they miss — usually service credits (a % of that month’s bill). Some SLAs offer only termination; many offer nothing at all.

The claim window

Credits are rarely automatic. You typically must file a claim within a deadline (often 30 days). Miss it and the credit evaporates.

The exclusions

Scheduled maintenance, beta features, force majeure, your own misconfig — the fine print that quietly shrinks the promise.
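
To see how exclusions shrink the promise, here is a minimal Python sketch that computes one month's availability twice: once counting every outage, and once ignoring outages the contract excludes. The `Outage` type, the reason labels, the 30-day month, and the rule that excluded minutes are simply dropped from the downtime total are all assumptions for illustration; each real SLA defines its own measurement method.

```python
from dataclasses import dataclass

MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day measurement period


@dataclass(frozen=True)
class Outage:
    minutes: float
    reason: str  # hypothetical labels, e.g. "provider_fault"


# Hypothetical exclusion labels mirroring the fine print listed above.
ASSUMED_EXCLUSIONS = frozenset(
    {"scheduled_maintenance", "beta_feature", "force_majeure", "customer_misconfig"}
)


def availability_pct(outages, exclusions=frozenset()):
    """Availability over the month, ignoring outages whose reason is excluded."""
    counted = sum(o.minutes for o in outages if o.reason not in exclusions)
    return 100 * (1 - counted / MINUTES_PER_MONTH)


if __name__ == "__main__":
    month = [Outage(30, "provider_fault"), Outage(120, "scheduled_maintenance")]
    print(f"what you experienced: {availability_pct(month):.3f}%")
    print(f"what the SLA counts:  {availability_pct(month, ASSUMED_EXCLUSIONS):.3f}%")
```

With these made-up numbers, the month falls below 99.9% from where you sit but stays above it on paper, because the two-hour maintenance window does not count.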

4 Play: the nines

What an uptime % actually buys you

Each extra “nine” cuts allowed downtime by ~10×. Tap a target to see how much outage it permits.

[Interactive tool: allowed downtime per day, per month, and per year]

At 99.9%, a single bad afternoon can blow the entire monthly budget.
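
The arithmetic behind the nines is short enough to sketch in Python. This version assumes a 30-day month and a 365-day year; the function names are illustrative, and a real SLA may define its measurement period differently.

```python
# Allowed downtime for a given uptime target.
# Assumption: 30-day month, 365-day year; real SLAs define their own period.
SECONDS_PER_DAY = 24 * 60 * 60
PERIOD_DAYS = {"day": 1, "month": 30, "year": 365}


def allowed_downtime_seconds(uptime_percent: float, period: str) -> float:
    """Return the downtime budget in seconds for one period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    down_fraction = (100 - uptime_percent) / 100
    return down_fraction * PERIOD_DAYS[period] * SECONDS_PER_DAY


def format_duration(seconds: float) -> str:
    """Render seconds as a short human-readable string."""
    if seconds < 60:
        return f"{seconds:.1f} s"
    if seconds < 3600:
        return f"{seconds / 60:.1f} min"
    return f"{seconds / 3600:.1f} h"


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        row = ", ".join(
            f"{period}: {format_duration(allowed_downtime_seconds(target, period))}"
            for period in PERIOD_DAYS
        )
        print(f"{target}% -> {row}")
```

Under these assumptions, 99.9% works out to about 43.2 minutes per month, which matches the headline figure.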

5 Play: the credit calculator

How much are you owed?

Enter your monthly spend, the committed SLA, and the uptime you actually observed. We apply a common tiered credit schedule (10% / 25% / 50%).

[Interactive tool: downtime this month, credit tier, and estimated credit]
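
A rough Python sketch of the same calculation follows. The 10% / 25% / 50% steps come from the text above, but the uptime cut-offs that trigger each step (below the committed target, below 99.0%, below 95.0%) are assumptions, as are the 30-day month and every name in the code. Your vendor's schedule is whatever the contract says.

```python
from dataclasses import dataclass

MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day billing month

# Hypothetical schedule: (uptime floor in %, credit as a fraction of the bill).
# The 10/25/50 steps are from the text; the 95.0 and 99.0 cut-offs are assumed.
ASSUMED_LOWER_TIERS = [(95.0, 0.50), (99.0, 0.25)]  # lowest floor first
FIRST_TIER_CREDIT = 0.10


@dataclass
class CreditEstimate:
    downtime_minutes: float
    credit_fraction: float
    credit_amount: float


def estimate_credit(monthly_spend: float, committed_pct: float,
                    observed_pct: float) -> CreditEstimate:
    """Estimate the service credit for one month under the assumed schedule."""
    downtime = (100 - observed_pct) / 100 * MINUTES_PER_MONTH
    if observed_pct >= committed_pct:
        fraction = 0.0  # target met, nothing owed
    else:
        fraction = FIRST_TIER_CREDIT
        for floor, tier in ASSUMED_LOWER_TIERS:
            # A deeper tier only applies if it sits below the committed target.
            if floor < committed_pct and observed_pct < floor:
                fraction = tier
                break
    return CreditEstimate(downtime, fraction, monthly_spend * fraction)


if __name__ == "__main__":
    print(estimate_credit(monthly_spend=1000, committed_pct=99.9, observed_pct=99.5))
```

For a $1,000 bill, a 99.9% commitment, and 99.5% observed uptime, this sketch lands in the first tier: about 216 minutes down and a $100 credit.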

6 Two lenses

How DevOps and FinOps each use the SLA

DevOps / SRE

  • SLA vs SLO vs SLI. The SLA is the contractual promise to customers; the SLO is the stricter internal target you run to; the SLI is the metric you measure.
  • Error budgets. 100% minus your SLO is the amount of failure you can afford to spend on shipping fast. A 99.9% SLO leaves a 0.1% budget.
  • Trust but verify. Monitor a provider’s real uptime; their status page is marketing, but your synthetic checks are evidence.
  • Design around the weakest SLA. Multi-region, retries, and fallbacks exist because sooner or later an upstream will spend its downtime budget.
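
As a small illustration of the SLI and error-budget ideas above, this Python sketch turns synthetic probe counts into an availability SLI and reports how much of the budget is left. The function names, the one-probe-per-minute cadence, and the 99.95% SLO are hypothetical.

```python
def sli_availability(successes: int, total: int) -> float:
    """SLI: share of probes that succeeded, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * successes / total


def error_budget_remaining(slo_pct: float, successes: int, total: int) -> float:
    """Fraction of the error budget left: 1.0 is untouched, 0 or less is spent."""
    allowed_failures = (100 - slo_pct) / 100 * total
    actual_failures = total - successes
    if allowed_failures == 0:
        # A 100% SLO leaves no budget at all.
        return 1.0 if actual_failures == 0 else float("-inf")
    return 1 - actual_failures / allowed_failures


if __name__ == "__main__":
    total_probes = 30 * 24 * 60   # assumption: one probe per minute for 30 days
    ok_probes = total_probes - 20  # 20 failed probes this month
    print(f"SLI: {sli_availability(ok_probes, total_probes):.4f}%")
    left = error_budget_remaining(99.95, ok_probes, total_probes)
    print(f"error budget remaining: {left:.1%}")
```

In this example the service is still inside a 99.9% SLA with room to spare, yet the stricter internal SLO has almost no budget left. That gap is the early warning the SLO is there to provide.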

FinOps / Procurement

  • A credit is not a refund. It offsets a future bill for that one service; it never repays your lost revenue or engineering time.
  • Claim hygiene. Track outages, file within the deadline, keep evidence. Unclaimed credits are simply forfeited.
  • Negotiate at renewal. Higher uptime tiers, bigger caps, and automatic credits are real line items you can push for.
  • Price the risk. A cheaper vendor with no SLA can be the expensive choice once you weight it by downtime cost.
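
One way to price the risk is to add expected downtime cost to the sticker price. The Python sketch below does that with entirely made-up inputs: the prices, the availability figures, and the $500-per-hour downtime cost are placeholders to swap for your own estimates. It leaves service credits out on purpose, since they rarely move the total.

```python
HOURS_PER_YEAR = 365 * 24


def risk_weighted_annual_cost(annual_price: float,
                              expected_availability_pct: float,
                              downtime_cost_per_hour: float) -> float:
    """Sticker price plus the expected cost of downtime over a year."""
    expected_down_hours = (100 - expected_availability_pct) / 100 * HOURS_PER_YEAR
    return annual_price + expected_down_hours * downtime_cost_per_hour


if __name__ == "__main__":
    # Hypothetical vendors: a cheaper one with no SLA (assumed 99.0% in practice)
    # and a pricier one that commits to 99.9%.
    cheap = risk_weighted_annual_cost(12_000, 99.0, 500)
    pricey = risk_weighted_annual_cost(18_000, 99.9, 500)
    print(f"no-SLA vendor:   ${cheap:,.0f}")
    print(f"99.9% vendor:    ${pricey:,.0f}")
```

With these placeholder figures, the vendor that looks $6,000 cheaper on the invoice comes out more than twice as expensive once downtime is weighted in.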

7 The lifecycle

Where the SLA shows up with every provider

Evaluate

Read the SLA before signing. No public SLA? Treat the service as best-effort and price that risk in.

Negotiate

For anything critical, ask for the enterprise SLA tier, a higher credit cap, and automatic credits in writing.

Operate

Monitor real uptime, alert on breaches, and file credit claims inside the window — every time.

Renew

Bring the outage record to renewal. A bad year is leverage for better terms or an exit.

8 The AI reality

Why this matters more every quarter

As you build on AI, you inherit its reliability — and its fine print

We track 31 AI & ML providers in the directory. Many ship their standard API “as is” with no uptime SLA, or publish one only on a private enterprise tier. When an AI endpoint is in your critical path, “no SLA” is a business risk you’re carrying whether you’ve priced it or not.
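
To put a number on "inheriting" reliability, a common back-of-the-envelope model multiplies the availabilities of everything in the critical path. The Python sketch below assumes every dependency is required and that failures are independent, which is a simplification; the component figures are hypothetical.

```python
from math import prod


def serial_availability_pct(component_pcts) -> float:
    """Availability of a chain where every component must be up.

    Assumes independent failures, so this is a rough estimate, not a guarantee.
    """
    return 100 * prod(p / 100 for p in component_pcts)


if __name__ == "__main__":
    # Hypothetical critical path: your app, your database, and an AI API
    # with no SLA that you have measured at roughly 99.5%.
    path = {"app": 99.95, "database": 99.95, "ai_api": 99.5}
    combined = serial_availability_pct(path.values())
    print(f"combined availability: {combined:.2f}%")
```

The combined figure is always lower than the weakest link, so a single best-effort AI endpoint can pull an otherwise solid stack below the target you promised your own customers.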

9 Test your SLA IQ

Five quick questions

1. A 99.9% monthly uptime SLA allows roughly how much downtime per month?

≈43 minutes. Each extra nine cuts that ~10×: 99.99% ≈ 4 min, 99.999% ≈ 26 sec.

2. When a vendor misses its SLA, a “service credit” is usually…

It’s a bill credit for that one service — not cash, and never your downstream losses.

3. The difference between an SLO and an SLA is…

SLO = internal objective (usually stricter); SLA = the external, contractual commitment with a remedy.

4. A provider’s terms say the service is provided “as is” with no uptime commitment. You have…

“As is” means best-effort. There’s no commitment and no remedy — common for self-serve AI APIs.

5. Who usually has to start the credit claim?

Unless the SLA says credits are automatic, the customer must file within the claim window or forfeit them.