# Don’t be mean: Statistical means and percentiles 101

Performance monitoring is an important part of running a successful application. One of the most basic ways to tell the performance of *something* is to measure the duration each time it happens and distill statistics from it.

# Mean

The mean, or average, of a collection of values is a good starting point for seeing how well or badly something behaves. It is calculated by summing all the values under consideration and then dividing by the number of occurrences.

In Ruby, this is what calculating the mean response time would look like:

```ruby
def mean(array)
  (array.sum.to_f / array.length).round(2)
end

durations = [1,2,3,4,5,6,7,8,9,0]
mean(durations) #=> 4.5
```

**Note**: In the example, we cast the total duration to a Float for a more accurate result when dividing. Otherwise, Ruby would round down to the nearest Integer, returning `4` instead.
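To make the note above concrete, here is a minimal sketch that compares the two kinds of division on the same total. The variable names are only illustrative, and `Integer#fdiv` is shown as an alternative to the `to_f` cast used in the article.

```ruby
total = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0].sum # 45
count = 10

# Integer / Integer drops the fractional part
integer_result = total / count      #=> 4

# Casting one side to a Float keeps the fraction
float_result = total.to_f / count   #=> 4.5

# Integer#fdiv divides and always returns a Float
fdiv_result = total.fdiv(count)     #=> 4.5
```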

# Median

Another useful statistic is the median. While it sounds similar, there’s a difference between the mean and median of a collection of values.

The median is the value separating the upper half of a set from the lower half of the set.

For a dataset with an odd number of values, you get the median by first sorting the values, then selecting the middle number. For a set with an even number of values, after sorting them, the median will be the mean of the two middle numbers.

```ruby
def median(array)
  sorted_array = array.sort
  length = sorted_array.length

  if length.odd? # Middle number for odd arrays
    sorted_array[length / 2]
  else # Mean of two middle numbers
    first_value = sorted_array[length / 2]
    second_value = sorted_array[length / 2 - 1]
    (first_value + second_value) / 2.to_f
  end
end

# Even array
durations = [1,2,3,4,5,6,7,8,9,0]
median(durations) #=> 4.5

# Odd array
durations = [1,1,2,3,4,5,6,7,8,9,0]
median(durations) #=> 4
```

This statistic is a good way of seeing if there is a huge skew in data or a long tail.

```ruby
durations = [1,2,3,4,5,2000]
median(durations) #=> 3.5
mean(durations) #=> 335.83
```

The mean for the durations above would be `335.83` because of the single outlier of 2000ms. The median, which is only `3.5`, indicates that there is a skew.

By calculating both the mean and median of a dataset, you can figure out if there are any large outliers or a long tail.
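One way to put that comparison into code is to look at how far the mean drifts away from the median. The sketch below is only an illustration: `mean_to_median_ratio` is a hypothetical helper, not something from a library, and it assumes positive durations so the median is never zero.

```ruby
def mean(values)
  values.sum.to_f / values.length
end

def median(values)
  sorted = values.sort
  middle = sorted.length / 2
  return sorted[middle] if sorted.length.odd?

  (sorted[middle - 1] + sorted[middle]) / 2.0
end

# Hypothetical helper: how many times larger the mean is than the median.
# A ratio close to 1 suggests a balanced set; a large ratio hints at outliers.
def mean_to_median_ratio(values)
  mean(values) / median(values)
end

balanced = mean_to_median_ratio([1, 2, 3, 4, 5, 6])    #=> 1.0
skewed   = mean_to_median_ratio([1, 2, 3, 4, 5, 2000]) # roughly 96
```

What counts as a "large" ratio depends on the application, so any threshold would need to be chosen per metric.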

While mean and median are good indicators of performance, they don’t tell the whole story. If you request a webpage ten times, the mean could be very low, but one or more requests can still take a very long time to complete.

The image below shows the 99th (blue) and 90th (green) percentiles and the mean (red) for a certain action in AppSignal. You can see that the 99th and 90th are quite far from the mean and there are some spikes. This means that while your average customer has a good experience, every once in a while there’s a user who has to wait almost twice as long for the page to render. Ideally, you would want to get all these values as close to each other as possible, creating a more consistent experience for all your users.

For example, take the following set of durations, where 10 customers request a page and the response takes between 100 milliseconds and 1 second:

`[100,100,100,100,100,100,100,100,100,1_000]`

This would result in a mean of just `190ms`, while one user had a very bad experience with a 1-second response time. When only tracking the mean, it's easy to think your website has great performance, while in reality, every once in a while, a user has a terrible experience.

The example above is only for 10 requests. With a thousand requests per day at the same ratio, a hundred of those users would have a terrible experience.
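The arithmetic behind this example can be sketched in a few lines of Ruby. The `slow_threshold` of 500ms is an assumption made for this illustration only; the article does not define what counts as a slow request.

```ruby
durations = [100, 100, 100, 100, 100, 100, 100, 100, 100, 1_000]

# Assumed cut-off for a "terrible" response time, in milliseconds
slow_threshold = 500

mean_duration = durations.sum.to_f / durations.length #=> 190.0

# Share of requests slower than the threshold
slow_share = durations.count { |duration| duration > slow_threshold }.fdiv(durations.length) #=> 0.1

# The same share applied to a thousand requests
affected_per_thousand = (slow_share * 1_000).round #=> 100
```

The mean alone hides the slow request, while the share of slow requests makes it visible.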

# Percentiles

To give a better idea of the distribution of the values, we use percentiles. Percentiles are similar to the median, the number that marks the point in the dataset where half of the set is below it and half of it is above. In the same way, the 20th percentile marks the point where 20% of the numbers in the dataset are below it.

Given the following (sorted) set:

`[100,100,200,200,300,300,400,400,500,5_000]`

If we want to know the 20th percentile, we can calculate it in the following way: There are 10 values in the set. The wanted value is at index 1 (`20.0 / 100 * 10 - 1`), as our arrays start at zero. Since this array contains an even number of items, we have to calculate the mean of the values at that index (`1`) and at index + 1 (`2`), which are `100` and `200`. This results in a value of `150` for the 20th percentile.
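As a quick sanity check on that result, we can count how much of the set actually falls below `150`. The helper name `share_below` is made up for this sketch.

```ruby
durations = [100, 100, 200, 200, 300, 300, 400, 400, 500, 5_000]

# Hypothetical helper: fraction of the values that are below the threshold
def share_below(values, threshold)
  values.count { |value| value < threshold }.fdiv(values.length)
end

below_p20 = share_below(durations, 150) #=> 0.2
```

Two of the ten values are below `150`, which matches the definition of the 20th percentile.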

A very naive Ruby implementation would look like this:

```ruby
def percentile(array, wanted_percentile)
  sorted_array = array.sort
  index = (wanted_percentile.to_f / 100) * sorted_array.length - 1

  # Check if index is not a round number
  if index != index.to_i
    # Fractional index: round up to the next position
    sorted_array.at(index.ceil)
  elsif sorted_array.length.even?
    first_value = sorted_array.at(index)
    second_value = sorted_array.at(index + 1)
    (first_value + second_value) / 2.0 # Float division, as in `median`
  else
    sorted_array.at(index)
  end
end

# An array with an odd amount of numbers
durations = [100,200,200,300,300,400,400,500,5_000]
percentile(durations, 20) #=> 200, index is a fraction (about 0.8), rounded up to 1
percentile(durations, 90) #=> 5000, index is a fraction (about 7.1), rounded up to 8
percentile(durations, 95) #=> 5000, index is a fraction (about 7.55), rounded up to 8

# An array with an even amount of numbers
durations = [100,100,200,200,300,300,400,400,500,5_000]
percentile(durations, 20) #=> 150.0, average of index 1 & 2: (100 + 200) / 2.0
percentile(durations, 90) #=> 2750.0, average of index 8 & 9: (500 + 5000) / 2.0
percentile(durations, 95) #=> 5000, index is a fraction (8.5), rounded up to 9
```

This `percentile` function looks very similar to our `median` calculation and, in fact, the `median` is the same as the `50th` percentile.

```ruby
durations = [1,2,3]
percentile(durations, 50) == median(durations) #=> true
```

AppSignal uses the statistics above to generate performance metrics for your application. We do not just rely on the mean/average but also calculate the 90th and 95th percentiles to show outliers, which gives a better idea of the distribution of your requests. Find out more on our performance tour page.

# Oddities

Because of the way percentiles and averages are calculated, it’s sometimes possible for the 90th percentile to dip below the mean. For example, given the following dataset:

```ruby
durations = [1,1,1,1,1,1,1,1,1,1,2000]
percentile(durations, 90) #=> 1
mean(durations) #=> 182.73
```

This would give us a `mean` of `182.73`, and a 90th percentile of just `1`.

If your metric collection system only shows the 90th percentile and the mean, you’d still be able to deduce that there’s a huge outlier somewhere in your dataset if the 90th percentile drops below the average.
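That deduction can be sketched as a small check. To keep the block self-contained, it uses a simplified nearest-rank percentile rather than the `percentile` function from earlier, and `outlier_hint?` is a hypothetical helper name, so treat it as an illustration and not as how any particular monitoring tool works.

```ruby
# Simplified nearest-rank percentile (an assumption for this sketch):
# the smallest value with at least `wanted`% of the data at or below it.
def nearest_rank_percentile(values, wanted)
  sorted = values.sort
  rank = (wanted * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

def mean(values)
  values.sum.to_f / values.length
end

# Hypothetical helper: true when the 90th percentile sits below the mean,
# which hints at a large outlier in the top 10% of the data.
def outlier_hint?(values)
  nearest_rank_percentile(values, 90) < mean(values)
end

with_outlier    = outlier_hint?([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2000])                 #=> true
without_outlier = outlier_hint?([100, 110, 120, 130, 140, 150, 160, 170, 180, 190]) #=> false
```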

# You are almost at 100% of this post

That’s it for now! In another post, we’re going to talk about how we efficiently store and calculate percentiles for all our customers’ requests using Quantiles. If you have any questions or remarks about statistics and APMs, error tracking, or performance monitoring, hit us up on Twitter @AppSignal or via email.

*Originally published at **blog.appsignal.com** on December 4, 2018.*