Service Level Indicators (SLI)

What is an SLI?

Service Level Indicators (SLIs) are quantifiable metrics used to measure the performance or quality of a service or an application. An SLI tells you the current state of a service.

For example, what error rate, latency, and downtime are permissible from a third-party SMS API? Mutually agreeing on these limits creates clarity between all parties and sets expectations.
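To make the idea concrete, here is a minimal sketch of how two such SLIs (error rate and p99 latency) could be computed from raw request records. The `Request` type, the function names, and the sample data are all hypothetical, and the percentile uses the simple nearest-rank method; a real monitoring stack would typically compute these for you.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    latency_s: float  # time taken to serve the request, in seconds
    ok: bool          # True if the request succeeded


def error_rate(requests):
    """Fraction of requests that failed (an error-rate SLI)."""
    if not requests:
        raise ValueError("need at least one request")
    failed = sum(1 for r in requests if not r.ok)
    return failed / len(requests)


def percentile_latency(requests, pct):
    """Nearest-rank latency percentile, with pct in (0, 100]."""
    if not requests:
        raise ValueError("need at least one request")
    latencies = sorted(r.latency_s for r in requests)
    # 1-based rank of the smallest value that covers pct percent of requests
    rank = max(1, math.ceil(pct * len(latencies) / 100))
    return latencies[rank - 1]


# Made-up sample: 98 fast successes, 1 slow success, 1 slow failure
sample = [Request(0.2, True)] * 98 + [Request(0.9, True), Request(1.5, False)]

print("error rate:", error_rate(sample))
print("p99 latency (s):", percentile_latency(sample, 99))
```

With this sample, one request in a hundred fails, and the p99 latency is the 99th value in sorted order, so the single slowest request does not affect it.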

SLI vs SLA vs SLO

In the above example, assume a FinTech business has a contract with the third-party SMS API vendor stating that API requests will have a p99 latency of ≤1s. There could be penalties if the p99 latency rises above 1s. Additionally, the vendor's team would respond to any ticket raised within 24 hours.

The SLA, in this case, is the set of commitments described above. Based on this, the third party might define stricter internal thresholds or goals, called SLOs, at a service or API level (say 0.8s for API-1, 0.5s for API-2, and so on).

Now, when the third party measures latency on an ongoing basis, the value being measured is the SLI.
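The relationship between the three can be sketched in a few lines. The SLA limit (1s) and the per-API SLOs (0.8s and 0.5s) come from the example above; the `classify` function and the measured values are hypothetical and only illustrate the comparison.

```python
SLA_P99_S = 1.0                            # contractual p99 limit from the example
SLO_P99_S = {"API-1": 0.8, "API-2": 0.5}   # internal per-API targets from the example


def classify(api, measured_p99_s):
    """Compare a measured p99 latency (the SLI) against the SLO and the SLA."""
    if measured_p99_s > SLA_P99_S:
        return "sla_breach"   # contract violated, penalties may apply
    if measured_p99_s > SLO_P99_S[api]:
        return "slo_breach"   # internal goal missed, contract still met
    return "ok"


# Hypothetical measurements, not real data
measured = {"API-1": 0.72, "API-2": 0.64}
for api, p99 in measured.items():
    print(api, p99, classify(api, p99))
```

Keeping the SLO tighter than the SLA gives the vendor an early warning: API-2 at 0.64s misses its internal goal while still being well inside the contractual limit.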

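The error budget is the amount of unreliability the SLO permits (for example, a 99.9% target leaves a 0.1% budget). The gap between the pre-defined target (SLO) and the actual value (SLI) shows how much of that budget remains.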
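As a rough worked example, a p99 ≤ 1s goal can be read as "99% of requests must finish within 1s", which leaves 1% of requests as the budget. The sketch below assumes that reading; the function name, the window size, and the request counts are hypothetical. `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def error_budget(slo_target, total_requests, bad_requests):
    """Return (allowed_bad, remaining) request counts for one time window.

    slo_target is the required fraction of good requests, e.g. "0.99".
    """
    target = Fraction(slo_target)
    allowed_bad = (1 - target) * total_requests  # the full error budget
    remaining = allowed_bad - bad_requests       # negative means overspent
    return allowed_bad, remaining


# Hypothetical window: 100,000 requests, 400 of them slower than 1s
allowed, remaining = error_budget("0.99", 100_000, 400)
print("budget:", int(allowed), "slow requests allowed")
print("remaining:", int(remaining))

# The same remainder, seen as the gap between the SLI and the SLO
sli = 1 - Fraction(400, 100_000)
print("SLI - SLO:", float(sli - Fraction("0.99")))
```

Here the measured SLI is 99.6% against a 99% target, so the 0.6% gap corresponds to 600 of the 1,000 budgeted slow requests still being unspent.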

Google's Cloud team also has a good blog post explaining the difference between the three, in case you want to read more.

