Understanding Disks, Part 1: Why Disk I/O Even Exists

Before you look at iostat, before you blame disks, before you scale hardware — you need one correct mental model.

Let’s build it.


Why does disk I/O even exist?

A program never wants disk.

A program wants data.

Disk I/O happens in only two situations:

  1. Read: data is not in RAM

  2. Write: data must be made durable

Everything else you observe — latency, queues, utilization — is just a consequence of this.
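
The write case can be sketched in a few lines of Python. This is a minimal illustration, assuming a POSIX-like system; the helper name `durable_write` and the file name are hypothetical. A plain `write()` normally lands in the page cache and returns immediately. `os.fsync()` is the call that asks the kernel to push the data to the storage device before returning, and that is where the write becomes disk I/O the process has to wait for.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> int:
    """Write payload, then block until the kernel reports it flushed to storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = os.write(fd, payload)  # usually lands in the page cache only
        os.fsync(fd)                     # ask the kernel to flush it to the device
        return written
    finally:
        os.close(fd)


# Hypothetical usage: write one record into a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.log")
    print(durable_write(target, b"committed\n"), "bytes made durable")
```

Without the `fsync`, the same data would still reach the disk eventually, but on the kernel's schedule rather than the program's.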

So when someone says:

“The disk is slow”

They usually mean one (or more) of these:

  1. High access latency → I/O takes time

  2. Too many requests → IOPS pressure

  3. Too much data → throughput limit

That’s it.

Every disk metric you’ll ever see maps back to these three forces. Nothing more.
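
The three forces are tied together by simple arithmetic, which a short sketch can make concrete. The function names and the numbers below are illustrative assumptions, not measurements from any real device: throughput is request rate times request size, and when only one request is in flight at a time, latency caps the request rate.

```python
def throughput_mib_s(iops: float, request_kib: float) -> float:
    """Throughput implied by a request rate and an average request size."""
    return iops * request_kib / 1024.0


def max_serial_iops(latency_ms: float) -> float:
    """Upper bound on IOPS when only one request is in flight at a time."""
    return 1000.0 / latency_ms


# Illustrative numbers only.
print(throughput_mib_s(iops=10_000, request_kib=4))   # many small requests
print(throughput_mib_s(iops=200, request_kib=1024))   # few large requests
print(max_serial_iops(latency_ms=5))                  # strictly one at a time
```

Ten thousand 4 KiB requests per second move far less data than two hundred 1 MiB requests per second, which is why a disk can be under IOPS pressure while its throughput looks unremarkable.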


What actually happens when a process needs data?

Let’s walk the real path.

When a process requests data:

  1. The kernel checks the page cache

  2. If the data is there → no disk I/O

  3. If not → kernel issues a block I/O request

  4. The request enters a queue

  5. The disk services it

  6. The process wakes up

Critical detail (do not skip this):

The process is sleeping, not “using disk”.

That single fact explains most production confusion.

So remember:

  • High disk stats ≠ high CPU usage

  • A waiting process ≠ a busy CPU

  • A sleeping process consumes zero CPU
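
On Linux you can see this sleep directly. The third field of `/proc/<pid>/stat` is the process state, and `D` is uninterruptible sleep, the state a process is typically in while it waits on disk I/O. The sketch below only parses a line; the sample line is made up and truncated, and the helper names are hypothetical.

```python
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (typically waiting on I/O)",
}


def parse_proc_state(stat_line: str) -> str:
    """Return the one-letter state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def describe_state(state: str) -> str:
    """Map a state letter to a short description."""
    return STATE_NAMES.get(state, "other")


# Made-up, truncated sample line for illustration.
sample = "4242 (my worker) D 1 4242 4242 0 -1 4194304"
print(describe_state(parse_proc_state(sample)))
```

A process in state `D` is not on a CPU. It is parked until the I/O completes.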


What iostat actually observes

iostat does not see:

  • Page cache hits

  • Application logic

  • Why the I/O happened

It sees only one thing:

Block I/O requests that reached the disk

That’s it.

So iostat answers:

“What happened once a request had to go to the disk?”

It does not answer:

“Why is my application slow?”

If you expect iostat to explain slowness by itself, you’re already wrong.
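
It helps to see how little iostat has to work with. On Linux its numbers are derived from per-device counters the kernel exposes in `/proc/diskstats`. Here is a minimal parser for one line, assuming the field order described in the kernel's I/O statistics documentation; the field names are my own labels and the sample line is made up.

```python
from typing import Dict, Union

# Labels for the counters that follow the device name.
FIELDS = (
    "reads_completed", "reads_merged", "sectors_read", "ms_reading",
    "writes_completed", "writes_merged", "sectors_written", "ms_writing",
    "ios_in_progress", "ms_doing_io", "weighted_ms_doing_io",
)


def parse_diskstats_line(line: str) -> Dict[str, Union[str, int]]:
    """Turn one /proc/diskstats line into a dict of named counters."""
    parts = line.split()
    # parts[0] and parts[1] are the major/minor numbers; parts[2] is the name.
    stats: Dict[str, Union[str, int]] = {"device": parts[2]}
    # Newer kernels append discard and flush counters; zip() ignores the extras.
    for name, value in zip(FIELDS, parts[3:]):
        stats[name] = int(value)
    return stats


# Made-up sample line for illustration.
sample = "   8       0 sda 1200 30 96000 4800 800 50 64000 6400 0 5000 11200"
print(parse_diskstats_line(sample))
```

Nothing in those counters mentions a process, a file, or a reason. Request counts, sectors, and time are all there is.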


The only promise iostat makes

iostat promises exactly four things:

  1. How many requests hit the disk

  2. How big those requests are

  3. How long they take

  4. Whether they pile up

Nothing more. Nothing less.

Expect more, and you’ll misread it.


Stripping iostat -x to its bones

Forget columns. Think in questions.

1️⃣ Are requests even happening?

r/s, w/s

If these are near zero, the disk is irrelevant.

Stop looking at it.


2️⃣ How much data is moving?

rkB/s, wkB/s

On its own, this is just volume. Divide it by r/s and w/s to get the average request size, which tells you whether you’re dealing with:

  • Lots of small I/O

  • Or fewer large transfers

Throughput problems live here.


3️⃣ Are requests slow?

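await (newer sysstat versions split it into r_await and w_await)

This is end-to-end latency in milliseconds, from submission to completion. It includes time spent waiting in the queue, not just the time the device spent on the request.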

But:

High await does not automatically mean bad disk.

Hold that thought.


4️⃣ Are requests queueing?

avgqu-sz (named aqu-sz in newer sysstat versions)

This tells you whether the disk (or something before it) can keep up.

Queue ≠ broken disk. You’ll learn why later.


5️⃣ Is the disk busy at all?

%util

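This is the percentage of time the disk had at least one request in flight. It only tells you whether the disk had work.

It does not prove saturation by itself: a device that serves many requests in parallel (SSD, NVMe, RAID) can sit near 100% and still have headroom.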


That’s it.

Everything else is decoration — for now.
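
Each of those five questions is a subtraction and a division over two snapshots of the kernel's per-device counters. The sketch below approximates the arithmetic under stated assumptions: the dictionary keys are hypothetical labels for the `/proc/diskstats` counters, sectors are 512 bytes, and the exact formulas in sysstat may differ by version. The snapshot values are invented.

```python
def iostat_metrics(prev: dict, curr: dict, interval_s: float) -> dict:
    """Approximate iostat -x style metrics from two counter snapshots."""
    d = {key: curr[key] - prev[key] for key in curr}
    ios = d["reads_completed"] + d["writes_completed"]
    wait_ms = d["ms_reading"] + d["ms_writing"]
    interval_ms = interval_s * 1000.0
    return {
        "r/s": d["reads_completed"] / interval_s,
        "w/s": d["writes_completed"] / interval_s,
        "rkB/s": d["sectors_read"] * 512 / 1024 / interval_s,
        "wkB/s": d["sectors_written"] * 512 / 1024 / interval_s,
        "await_ms": wait_ms / ios if ios else 0.0,
        "aqu-sz": d["weighted_ms_doing_io"] / interval_ms,
        "%util": 100.0 * d["ms_doing_io"] / interval_ms,
    }


# Invented snapshots taken one second apart.
before = dict(reads_completed=0, writes_completed=0, sectors_read=0,
              sectors_written=0, ms_reading=0, ms_writing=0,
              ms_doing_io=0, weighted_ms_doing_io=0)
after = dict(reads_completed=1000, writes_completed=500, sectors_read=16000,
             sectors_written=8000, ms_reading=2000, ms_writing=4000,
             ms_doing_io=900, weighted_ms_doing_io=6000)

for name, value in iostat_metrics(before, after, interval_s=1.0).items():
    print(f"{name:>9}: {value:g}")
```

Every column is a rate or a ratio of counters. None of them carries a verdict on whether the disk is healthy.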


The first mental checkpoint (critical)

Say this out loud:

High await does not automatically mean the disk is bad

Sometimes the disk is fine, but:

  • I/O is serialized

  • Requests are throttled

  • Forced syncs (fsync, journaling) are in play

You don’t need to understand those yet — just accept that latency ≠ disk failure.


Glue this model permanently

Process requests data from a block device → goes to sleep → wakes on completion

This is the core mental model.

If a process is waiting on disk I/O:

  • CPU is not the bottleneck for that thread

  • CPU cannot help until I/O completes

  • Adding CPU does nothing

So:

I/O wait is not CPU starvation

For that thread, they are mutually exclusive.
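
You can watch the gap between waiting and executing without touching a disk. The sketch below uses a pipe as a stand-in for a slow device, which is an assumption made so the delay is controllable: the read blocks until a timer thread writes one byte. Wall-clock time grows by the full delay, while CPU time barely moves. The function name `measure_blocking_read` is hypothetical.

```python
import os
import threading
import time


def measure_blocking_read(delay_s: float = 0.2):
    """Block on a read for delay_s seconds; return (wall_s, cpu_s, data)."""
    read_fd, write_fd = os.pipe()
    # The timer plays the role of the device completing the request later.
    timer = threading.Timer(delay_s, os.write, args=(write_fd, b"x"))
    timer.start()

    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    data = os.read(read_fd, 1)  # sleeps here until the byte arrives
    wall_s = time.perf_counter() - wall_start
    cpu_s = time.process_time() - cpu_start

    timer.join()
    os.close(read_fd)
    os.close(write_fd)
    return wall_s, cpu_s, data


wall_s, cpu_s, _ = measure_blocking_read()
print(f"wall: {wall_s:.3f}s  cpu: {cpu_s:.3f}s")
```

The wait shows up in wall time and nowhere else. A faster CPU would not shorten it.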


Four common system states

Case 1: CPU high, disk low

→ CPU-bound workload
→ Disk stats irrelevant


Case 2: Disk high, CPU low

→ I/O-bound workload
→ Processes sleeping
→ System “feels slow”


Case 3: Disk high, CPU high

→ Mixed workload
→ Often bad application behavior (sync I/O, poor batching)


Case 4: Disk low, CPU low, system slow

→ Not a disk problem
→ Look at locks, memory pressure, network, or application logic
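
The four cases reduce to a small truth table. This sketch deliberately takes booleans, because what counts as "high" depends on your system and is not something this article defines; the function name and the returned strings are illustrative.

```python
def classify(cpu_high: bool, disk_high: bool, system_slow: bool) -> str:
    """Map the four common states to a first hypothesis."""
    if cpu_high and not disk_high:
        return "CPU-bound: disk stats irrelevant"
    if disk_high and not cpu_high:
        return "I/O-bound: processes are sleeping on I/O"
    if cpu_high and disk_high:
        return "mixed: check for sync I/O and poor batching"
    if system_slow:
        return "not disk: look at locks, memory pressure, network, app logic"
    return "no bottleneck indicated"


print(classify(cpu_high=False, disk_high=True, system_slow=True))
print(classify(cpu_high=False, disk_high=False, system_slow=True))
```

The last branch is not one of the four cases. It covers a system that is quiet and not slow.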


Repeat this until it sticks:

CPU executes.
Disk serves.
Queues wait.
Processes sleep.


The first real insight

Here it is:

If CPU is low and disk stats are high,
the system is slow because progress depends on I/O completion — not execution speed.

That single sentence explains most production disk incidents.


Final lock-in question

What does a sleeping process tell you about CPU usage at that moment?

Answer:

When a process issues a blocking disk read and the data is not in page cache:

  • The kernel submits a block I/O

  • The process is put to sleep

  • The process consumes zero CPU until completion


Two clarifications you must internalize

1️⃣ CPU is low for that process, not necessarily the system

Other processes may still run.

Correct model:

I/O wait removes the requesting process from the run queue

Incorrect model:

“The CPU becomes idle”


2️⃣ A queue does not always mean an overloaded disk

Requests may queue because of:

  • Block layer scheduling

  • Device ordering

  • Journaling or fsync

  • Enforced serialization

This will matter later when interpreting avgqu-sz.
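
One way to see why queue size alone proves nothing is Little's law: average queue depth equals request rate times average time in the system. The sketch below applies it with invented numbers; the function name is hypothetical.

```python
def avg_queue_depth(iops: float, await_ms: float) -> float:
    """Little's law: requests in flight = arrival rate x time per request."""
    return iops * await_ms / 1000.0


# Invented numbers: two very different disks, same average queue.
print(avg_queue_depth(iops=8000, await_ms=1))    # fast device, lots of parallel work
print(avg_queue_depth(iops=80, await_ms=100))    # slow requests, little work done
```

Both lines print the same queue depth. One disk is completing 8000 requests a second, the other 80. The queue number only means something next to the request rate and the latency.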


Foundation complete

Burn this into your brain:

Sleeping on disk I/O = zero CPU consumption for that thread

If you forget this, every disk metric you read will lie to you.