Averages, medians, percentiles: when each one is the wrong choice

Your operations dashboard says: Average delivery time: 2.8 days.

Your manager reads this and thinks everything is fine. Under 3 days? Great. Ship it.

But here is what is actually happening:

70% of orders arrive in 1-2 days
20% arrive in 3-4 days
10% arrive in 8-14 days

That last group—the 10%—is having a terrible experience. They are writing angry reviews. They are calling customer support. They are requesting refunds.

The "average" hid all of that. When you mix a bunch of fast deliveries with a few very slow ones, the average looks "fine." It smooths out the pain.
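To see how that happens, here is a rough Python sketch of the weighted average behind a dashboard number like this one. The bucket midpoints (1.5, 3.5, and 11 days) are my assumption for illustration; real orders would be spread across each range.

```python
# Share of orders (in percent) and an assumed midpoint for each bucket.
# The midpoints are illustrative guesses, not real order data.
buckets = [
    (70, 1.5),   # 70% arrive in 1-2 days
    (20, 3.5),   # 20% arrive in 3-4 days
    (10, 11.0),  # 10% arrive in 8-14 days
]

# Weighted average: each bucket contributes share * midpoint.
average_days = sum(share * midpoint for share, midpoint in buckets) / 100

# How much of that average comes from the slow 10% alone.
slow_contribution = 10 * 11.0 / 100

print(f"average: {average_days:.2f} days")
print(f"days contributed by the slowest 10%: {slow_contribution:.2f}")
```

Under these assumptions the average comes out at 2.85 days, close to the dashboard's 2.8, and more than a full day of it comes from the slowest tenth of orders.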

This is the most common analytical mistake I see. And it's not just a junior mistake—I have seen VPs make terrible decisions based on averages that were completely misleading.

Three ways to describe the exact same data

Let's say these are the delivery times (in days) for 10 recent orders: 1, 1, 2, 2, 2, 3, 3, 4, 8, 14

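Mean (Average): Add them all up, divide by 10. 40 / 10 = 4.0 days
Median (P50): Sort them, take the middle value. With 10 values, that is the average of the 5th and 6th (2 and 3): 2.5 days
P90 (90th Percentile): The value at or below which 90% of orders fall. 8.0 days by the nearest-rank method (interpolating functions such as PERCENTILE_CONT return 8.6)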
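If you want to check those three numbers yourself, here is a minimal Python sketch. The helper nearest_rank_percentile is a name I made up for this example; it implements the simple nearest-rank definition of a percentile, which is one of several definitions in use.

```python
import math
import statistics

# Delivery times (in days) for the 10 orders listed above.
delivery_days = [1, 1, 2, 2, 2, 3, 3, 4, 8, 14]

def nearest_rank_percentile(values, pct):
    """Smallest value with at least pct percent of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

mean = statistics.mean(delivery_days)      # 40 / 10
median = statistics.median(delivery_days)  # average of the two middle values
p90 = nearest_rank_percentile(delivery_days, 90)

print(f"mean={mean}, median={median}, p90={p90}")
```

Other percentile definitions interpolate between neighbouring values, so different tools can report slightly different P90 figures on small samples like this one.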

Same data. Three very different stories:

The Average says "about 4 days": sounds okay, slightly slow.
The Median says "half of our orders arrive within 2.5 days": sounds great.
The P90 says "the slowest 10% of our orders take 8 days or more": sounds like a crisis.

Which one should your manager see? That depends entirely on the question they are asking.

When to use which metric

1. Use the MEDIAN for the "Typical" Experience

"What does a normal customer experience look like?" The answer is the median. Half of your customers had a faster experience, half had a slower one. The median isn't dragged up by extreme outliers.

Use for: Salaries, delivery times, session durations, typical order values.

-- Calculating Median (PostgreSQL/Snowflake)
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY delivery_days) AS median_delivery
FROM orders;
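To make the "not dragged up by outliers" point concrete, here is a small Python sketch. The second list is a hypothetical variation of the sample orders in which the slowest delivery takes 60 days instead of 14.

```python
import statistics

# The ten sample delivery times used earlier in this article.
original = [1, 1, 2, 2, 2, 3, 3, 4, 8, 14]

# Hypothetical variation: the slowest order takes 60 days instead of 14.
with_extreme_delay = [1, 1, 2, 2, 2, 3, 3, 4, 8, 60]

for label, days in [("original", original), ("extreme delay", with_extreme_delay)]:
    print(label, "mean:", statistics.mean(days), "median:", statistics.median(days))
```

The mean more than doubles, from 4.0 to 8.6 days, while the median stays at 2.5. One bad order changed the average story completely and left the typical experience untouched.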

2. Use the MEAN (Average) when the Total matters

"What is our average revenue per user?" Here, the total revenue divided by total users is actually meaningful. One "whale" customer spending ₹10 Lakh matters to the business, even if they are an outlier. The mean captures that total financial impact. The median would hide it.

Use for: Revenue metrics, cost calculations, budget planning.
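Here is a quick Python sketch of why the mean is the right tool when totals matter. The numbers are invented for illustration: 99 users who each spend ₹1,000 and one whale who spends ₹10 Lakh (₹1,000,000).

```python
import statistics

# Hypothetical revenue per user in rupees: 99 regular users and one whale.
revenue_per_user = [1_000] * 99 + [1_000_000]

total_revenue = sum(revenue_per_user)
mean_revenue = statistics.mean(revenue_per_user)
median_revenue = statistics.median(revenue_per_user)

# The mean multiplied by the user count always reproduces the total.
total_from_mean = mean_revenue * len(revenue_per_user)

# The median multiplied by the user count ignores the whale entirely.
total_from_median = median_revenue * len(revenue_per_user)

print(f"actual total:      {total_revenue:,}")
print(f"from mean:         {total_from_mean:,}")
print(f"from median:       {total_from_median:,.0f}")
```

In this made-up case the median-based estimate is ₹100,000 against an actual total of ₹1,099,000, so a budget built on the median would miss most of the revenue.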

3. Use PERCENTILES for the Extremes

"What does our worst 10% look like?" That is P90 (or P95, or P99). This is critical for operational metrics where bad experiences have outsized consequences. Amazon famously optimizes for P99 page load latency, not average latency. If 1% of Amazon customers experience a 10-second page load, that is millions of angry people. The average might look fine. The tail is on fire.

-- Calculating P50, P90, and P95
SELECT
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY delivery_days) AS p50,
    PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY delivery_days) AS p90,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY delivery_days) AS p95
FROM orders;
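If you want to sanity-check the SQL output outside the database, here is a Python sketch of a linear-interpolation percentile. It assumes the interpolation rule that PostgreSQL documents for PERCENTILE_CONT; the function name percentile_cont is my own and is not a library call.

```python
def percentile_cont(values, fraction):
    """Percentile by linear interpolation between the two nearest sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)  # 0-based fractional index
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])

# The ten sample delivery times used earlier in this article.
delivery_days = [1, 1, 2, 2, 2, 3, 3, 4, 8, 14]

for name, fraction in [("p50", 0.5), ("p90", 0.9), ("p95", 0.95)]:
    print(name, round(percentile_cont(delivery_days, fraction), 2))
```

On the ten sample orders this gives roughly 2.5, 8.6, and 11.3 days. The P90 differs from the 8.0 quoted earlier because that figure corresponds to a nearest-rank definition, while interpolation lands between the 9th and 10th values. On real tables with thousands of rows the two definitions are usually much closer.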

A real scenario where the Average almost cost a company money

True story. An e-commerce company was deciding whether to plaster a "2-Day Delivery Guarantee" banner across their website.

The average delivery time was 1.9 days. It looked safe. I asked them to run the percentiles.

Mean: 1.9 days
Median (P50): 1.4 days
P75: 2.1 days
P90: 3.8 days
P95: 5.2 days

P75 was 2.1 days, which means at least 25% of their orders were already taking longer than 2 days. If they launched that "2-Day Guarantee," a quarter of their customers would instantly feel lied to. That isn't a rounding error. It is a brand reputation crisis waiting to happen.
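Before making a promise like this, the most direct check is to count how many orders would break it. Here is a minimal Python sketch; the delivery times are invented for illustration and are not the company's data, and share_over is a helper name I made up.

```python
import statistics

def share_over(values, threshold):
    """Fraction of values strictly greater than the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v > threshold) / len(values)

# Hypothetical delivery times in days (made up for this example).
sample_days = [1.0, 1.2, 1.3, 1.4, 1.5, 1.9, 2.4, 4.5]

print(f"mean: {statistics.mean(sample_days):.1f} days")
print(f"share over 2 days: {share_over(sample_days, 2.0):.0%}")
```

In this toy sample the mean is under 2 days, yet 2 of the 8 orders (25%) would miss a 2-day guarantee. The mean tells you nothing about how often the promise is broken; the share above the threshold does.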

They didn't launch the banner. Instead, they investigated the P90 tail (it turned out specific pin codes had poor courier coverage), fixed the logistics to bring P90 down to 2.1 days, and then launched the promise safely.

The analyst who ran those percentiles prevented a disastrous business decision. That is the job.

The "Distribution" Mindset

Once you start thinking in distributions instead of single numbers, you will see data differently.

"Average order value is ₹1,200" → Okay, but what does the distribution look like? Is it a tight cluster where everyone spends exactly ₹1,200? Or is it bimodal—a bunch of ₹300 orders and a bunch of ₹3,000 orders averaging out to ₹1,200? Those are two completely different businesses.

When in doubt, bucket it.

-- Quick distribution check using buckets
SELECT
    CASE
        WHEN order_value < 500 THEN '1. 0-499'
        WHEN order_value < 1000 THEN '2. 500-999'
        WHEN order_value < 2000 THEN '3. 1000-1999'
        WHEN order_value < 5000 THEN '4. 2000-4999'
        ELSE '5. 5000+'
    END AS value_bucket,
    COUNT(*) AS orders,
    ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 1) AS pct
FROM orders
GROUP BY 1 ORDER BY 1;

Result:

0-499: 35%
500-999: 23%
1000-1999: 20%
2000-4999: 15%
5000+: 7%

Now you know the truth: most orders are small, with 58% under ₹1,000. The ₹1,200 average is being pulled up hard by the 7% of orders worth ₹5,000 or more. If you are building a marketing campaign for the "typical" customer, you need to focus on the ₹0-999 range.
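If your data lives in a CSV or a notebook rather than a database, the same bucket check can be sketched in Python. The order values below are hypothetical, and value_bucket is a helper name I made up; the bucket boundaries are the ones used in this section.

```python
from collections import Counter

# Hypothetical order values in rupees, for illustration only.
order_values = [120, 450, 300, 800, 650, 1500, 1200, 2500, 4000, 7000]

def value_bucket(order_value):
    """Assign an order to a labelled bucket; labels sort in bucket order."""
    if order_value < 500:
        return "1. 0-499"
    if order_value < 1000:
        return "2. 500-999"
    if order_value < 2000:
        return "3. 1000-1999"
    if order_value < 5000:
        return "4. 2000-4999"
    return "5. 5000+"

counts = Counter(value_bucket(v) for v in order_values)
total = len(order_values)

for bucket in sorted(counts):
    pct = round(100 * counts[bucket] / total, 1)
    print(f"{bucket}: {counts[bucket]} orders ({pct}%)")
```

The numbered prefixes on the labels are there so that sorting the labels as text keeps the buckets in value order.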

How to present this to your manager

Don't say: "The average you are using is misleading."

Say: "The average delivery time is 2.8 days, but I recommend we also start tracking P90, which is currently sitting at 6.1 days. That means 10% of our customers—roughly 3,000 orders a month—are waiting over 6 days. I dug into the P90 group and the delays are concentrated in [Specific Courier Partner]. If we fix that bottleneck, we can bring P90 under 4 days."

You just taught your manager a new metric AND gave them an action plan. That is how you build trust.

One thing to do this week

Take any metric you've reported recently: delivery time, session duration, order value. Calculate the Mean, Median, and P90.

As a rule of thumb, if the Mean and Median are more than 20% apart, you probably have a skewed distribution, and your Mean is lying to you. Now ask yourself: Which number should the business actually be looking at?
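Here is a small Python sketch of that check. It measures the gap relative to the median, which is my assumption about what "20% apart" means; mean_median_gap is a helper name I made up.

```python
import statistics

def mean_median_gap(values):
    """Absolute gap between mean and median, as a fraction of the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if median == 0:
        raise ValueError("median is zero, so a relative gap is undefined")
    return abs(mean - median) / abs(median)

# The ten sample delivery times used earlier in this article.
delivery_days = [1, 1, 2, 2, 2, 3, 3, 4, 8, 14]

gap = mean_median_gap(delivery_days)
verdict = "check the distribution" if gap > 0.20 else "mean and median roughly agree"
print(f"gap: {gap:.0%} -> {verdict}")
```

For the sample delivery times the gap is 60% (mean 4.0 against median 2.5), well past the 20% line. Treat the threshold as a prompt to look at the distribution, not as a formal statistical test.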
