
I wanted to share some of the great research we have seen over the years on cross-functional teams and flow. Most organizations want agility. They get trained in the processes, but they still don't take it to the next level: fixing what the processes reveal.
Many leaders I work with emphasize the process (e.g., Scrum). They don't realize that Scrum isn't the goal; agility is. Scrum is shining a light (very clearly by the way) on why we aren't agile. Sometimes we even blame Scrum, but that's another topic.
Let's focus on what it takes to be agile in our line of work (knowledge work).
Your team is working 60-hour weeks. Every developer has three projects going. Your backlog is massive. And somehow, nothing ships. What if I told you the problem isn't that you need more people, it's that you're already doing too much?
I hear the same story from leaders every week. They walk into Sprint Planning and see 40 items in progress. They ask, "Why is nothing finishing?" The team says, "We're slammed. We need more people."
So they add contractors. Six months later, they're slower than before.
Here's what's actually happening, and why understanding constraints instead of obsessing over capacity will change everything about how your team delivers.
The symptom: high activity, low delivery
When you're fully loaded (meaning everyone's working on multiple things), you create queues everywhere. Design waits on research. Development waits on design. Testing waits on development. Every handoff is a delay. Every person is context-switching between three things, which means they're actually effective on zero things.
The research on multitasking is brutal.
I am really not saying anything new. Mike Cohn and Gerald Weinberg found the same thing in planning research. In Weinberg's widely cited rule of thumb, a single task gets all of your working time. Add a second task, and each one gets about 40% of your time, with 20% lost to switching. Add a third, and each gets about 20%, while you're spending 40% of your day just switching contexts.
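To make the switching tax concrete, here is a minimal sketch in Python. The loss percentages are the rule-of-thumb figures commonly attributed to Weinberg, treated here as an assumption rather than measured data, and the names SWITCHING_LOSS and time_per_task are hypothetical.

```python
# Share of working time lost to context switching, keyed by the number of
# simultaneous tasks. Assumed rule-of-thumb figures, not measurements.
SWITCHING_LOSS = {1: 0.00, 2: 0.20, 3: 0.40}


def time_per_task(num_tasks: int) -> float:
    """Return the fraction of a person's time each task actually receives."""
    loss = SWITCHING_LOSS[num_tasks]
    return (1.0 - loss) / num_tasks


for n in sorted(SWITCHING_LOSS):
    print(f"{n} task(s): {time_per_task(n):.0%} per task, "
          f"{SWITCHING_LOSS[n]:.0%} lost to switching")
```

Swap in your own observed numbers if you have them. The shape of the result matters more than the exact percentages.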
So the symptom is clear: high activity, low delivery. Everyone's working hard. Nothing's shipping. And leadership thinks the answer is more capacity.
The hard truth: your bottleneck isn't capacity, it's flow
Here's the truth most organizations don't want to hear: you don't have a people problem. You have a queue problem.
Queues are caused by two things: variability in the work and operating near 100% utilization.
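Don Reinertsen lays this out in The Principles of Product Development Flow. He shows that as you approach 100% utilization, your queue size doesn't grow linearly. It grows far faster than that, and the curve gets steeper the closer you get: each time you halve your spare capacity, the queue roughly doubles. In the basic single-server queueing model, the average number of items in the system is utilization divided by (1 minus utilization). At 80% utilization, that's about 4 items, which is manageable. At 90%, it more than doubles to 9. At 95%, it's 19. At 98%? You're looking at about 49 items in the system for every one being worked on.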
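The curve is easy to sketch numerically. This assumes the textbook single-server M/M/1 queueing model (random arrivals, random service times), which is a simplification of any real team, and the function name items_in_system is hypothetical.

```python
def items_in_system(utilization: float) -> float:
    """Average number of items in an M/M/1 system (waiting plus in service)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization)


for rho in (0.50, 0.80, 0.90, 0.95, 0.98):
    print(f"{rho:.0%} utilization: about {items_in_system(rho):.0f} items in the system")
```

Real teams won't match the model exactly, but the pattern holds: the last few percent of utilization are the expensive ones.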
The DORA 2024 report makes this point clearly: systems thinking beats individual speed every single time. The report found that even when AI tools helped developers write code faster (which they did), it didn't improve throughput or stability at the system level. In fact, a 25% increase in AI adoption was associated with an estimated 1.5% decrease in delivery throughput and a 7.2% drop in delivery stability.
Why? The bottleneck wasn't in writing code. The bottleneck was getting good code into production. The constraint was downstream. All that extra code just piled up in the queue, waiting for testing, security review, and deployment slots. Faster code creation without addressing the constraint just made the queue longer.
Let me show you the math
Here's how bad this gets with one simple example. Let's say your team can handle 10 user stories per sprint. Your capacity is 10. Now let's compare two scenarios.
Scenario A: Limited WIP
You only allow 10 stories to be active at any time. When one finishes, you pull the next one.
Using Little's Law (queue time equals queue size divided by processing rate), your cycle time is:
Cycle Time = Queue Size / Processing Rate
Cycle Time = 10 stories / 10 stories per sprint
Cycle Time = 1 sprint per story
Result: Every story takes one sprint to finish.
Scenario B: Unlimited WIP (the "keep everyone busy" approach)
Leadership wants "everyone productive," so you start 30 stories. You still finish 10 per sprint (that's your capacity), but now you have 30 in flight.
Cycle Time = Queue Size / Processing Rate
Cycle Time = 30 stories / 10 stories per sprint
Cycle Time = 3 sprints per story
Result: Every story now takes three sprints to finish. Same people. Same capacity. Triple the delay.
Scenario C: Unlimited WIP + Context Switching Reality
But wait, it gets worse. Remember the context-switching tax? When people are juggling three times as many things, they slow down. Let's say they lose 30% productivity to switching. Now you're only finishing seven stories per sprint instead of 10.
Cycle Time = Queue Size / Processing Rate (adjusted for switching cost)
Cycle Time = 30 stories / 7 stories per sprint
Cycle Time = 4.3 sprints per story
Result: You just went from one sprint per story to more than four, simply by starting more work.
You cannot improve flow by running at 100%. You improve flow by limiting how much you start.
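If you want to plug in your own numbers, the three scenarios can be sketched in a few lines of Python. The function name cycle_time and the scenario labels are hypothetical, and the 30% switching loss is the same illustrative assumption used in Scenario C.

```python
def cycle_time(wip: float, throughput: float) -> float:
    """Little's Law: average cycle time = work in progress / throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


CAPACITY = 10          # stories finished per sprint
SWITCHING_LOSS = 0.30  # assumed productivity lost to context switching

scenarios = {
    "A: limited WIP": (10, CAPACITY),
    "B: unlimited WIP": (30, CAPACITY),
    "C: unlimited WIP + switching": (30, CAPACITY * (1 - SWITCHING_LOSS)),
}

for name, (wip, rate) in scenarios.items():
    print(f"{name}: {cycle_time(wip, rate):.1f} sprints per story")
```

The only input that changes between A and B is how much work was started. Capacity stays the same.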
Find your constraint, not your capacity
In Eliyahu Goldratt's Theory of Constraints, the constraint is the resource that determines the throughput of your entire system. It might be your one senior architect. It might be your compliance review. It might be your deployment pipeline. Wherever it is, that constraint sets the pace for everything.
And here's the key insight: protecting that constraint is more valuable than keeping everyone busy.
If you have eight developers and one architect, and the architect can review two designs per sprint, your throughput is two per sprint. Period. Having the eight developers start 14 new designs doesn't help. After the first sprint, it creates a queue of 12 designs waiting on the architect. Those designs age, requirements drift, the market changes, and by the time the architect gets to the twelfth design in that queue, six sprints later, it's obsolete.
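Here is the architect example as a rough sketch. The stage names, the function names, and the idea of modeling each stage as a simple per-sprint rate are my assumptions for illustration. Real pipelines are messier.

```python
import math


def system_throughput(stage_rates: dict[str, float]) -> tuple[str, float]:
    """Return the slowest stage and its rate; that rate caps the whole system."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


def sprints_to_clear(queued_items: int, constraint_rate: int) -> int:
    """Sprints the constraint needs to work through a queue of waiting items."""
    return math.ceil(queued_items / constraint_rate)


# Items each stage can handle per sprint (hypothetical numbers from the example).
rates = {"developers drafting designs": 14, "architect review": 2}

constraint, rate = system_throughput(rates)
waiting = 14 - 2  # designs still queued after the first sprint of reviews

print(f"Constraint: {constraint} at {rate} per sprint")
print(f"{waiting} designs waiting, {sprints_to_clear(waiting, 2)} more sprints to clear")
```

Raising the developers' rate changes nothing in the output except the size of the waiting pile.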
The DORA 2024 report supports this. High-performing teams don't optimize individual productivity. They optimize flow. The report found that teams with stable priorities (meaning they finish what they start before switching) have 40% less burnout and significantly higher throughput than teams with shifting priorities.
Finishing matters more than starting.
What to do Monday morning
Here's your practical playbook:
Step 1: Make your queue visible
Get a board (physical or digital) and put every active work item on it. Not "to do." Active. In process. If someone is touching it this sprint, it goes on the board.
Step 2: Count it
Count every item on the board. If that number is more than twice your completion rate per sprint, you have a queue problem. For example, if you complete five items per sprint and have 15 in progress, you're in trouble.
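As a quick sanity check, the rule of thumb fits in one function. The name has_queue_problem is hypothetical, and the default ratio of 2 is the threshold from this step, not a universal constant.

```python
def has_queue_problem(in_progress: int, completed_per_sprint: float,
                      ratio: float = 2.0) -> bool:
    """True when work in progress exceeds `ratio` times the completion rate."""
    return in_progress > ratio * completed_per_sprint


print(has_queue_problem(in_progress=15, completed_per_sprint=5))  # the example above
print(has_queue_problem(in_progress=8, completed_per_sprint=5))
```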
Step 3: Find the constraint
Where is work piling up? Is it waiting for design? Waiting for code review? Waiting for a deployment slot? That's your constraint. Put a sticky note on it. Name it.
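If your board is digital, you can let the data point at the pile. This is a minimal sketch that assumes you can export each active item with a field saying what it is waiting for. The board contents and the field name waiting_for are made up for illustration.

```python
from collections import Counter

# Hypothetical export of active work items from a board.
board = [
    {"item": "A-1", "waiting_for": "code review"},
    {"item": "A-2", "waiting_for": "design"},
    {"item": "A-3", "waiting_for": "code review"},
    {"item": "A-4", "waiting_for": "deployment slot"},
    {"item": "A-5", "waiting_for": "code review"},
    {"item": "A-6", "waiting_for": "deployment slot"},
]


def find_constraint(items: list[dict[str, str]]) -> str:
    """Return the stage with the most items waiting in front of it."""
    if not items:
        raise ValueError("no active items to inspect")
    counts = Counter(item["waiting_for"] for item in items)
    stage, _ = counts.most_common(1)[0]
    return stage


print(f"Work is piling up in front of: {find_constraint(board)}")
```

The biggest pile is a strong hint, not proof. Confirm it by asking the people in that stage.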
Step 4: Set a WIP limit
Reinertsen and the Kanban community agree that WIP limits force rate matching. If your constraint can handle three things, limit the queue feeding it to five. Not 20. Five. When it fills up, upstream people stop starting new work and go help clear the constraint.
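The mechanics of a WIP limit are simple enough to sketch. This is an illustration under my own assumptions, not how any particular Kanban tool works, and the class and method names are hypothetical.

```python
class WipLimitedStage:
    """A stage that refuses new work once it holds `limit` items."""

    def __init__(self, name: str, limit: int) -> None:
        self.name = name
        self.limit = limit
        self.items: list[str] = []

    def try_pull(self, item: str) -> bool:
        """Accept the item only if the stage is below its limit."""
        if len(self.items) >= self.limit:
            return False
        self.items.append(item)
        return True

    def finish(self, item: str) -> None:
        """Remove a completed item, freeing one slot."""
        self.items.remove(item)


queue = WipLimitedStage("feeding the constraint", limit=5)
for story in ["S1", "S2", "S3", "S4", "S5", "S6"]:
    if not queue.try_pull(story):
        print(f"{story}: limit reached, go help clear the constraint instead")

queue.finish("S1")                     # something finishes...
print("S6 pulled:", queue.try_pull("S6"))  # ...so the next item can start
```

The refusal is the point. It is the signal for upstream people to stop starting and help finish.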
Step 5: Protect the constraint
This is straight from Goldratt. Don't let your constraint sit idle. Don't let low-priority work block it. If your senior architect is your constraint, don't have them in four-hour meetings. Don't give them administrative tasks. Protect their time like it's gold, because in terms of throughput, it is.
Three common traps to avoid
Trap 1: Thinking WIP limits mean people sit idle.
No. It means they swarm. If development can't start new work because testing is full, developers go help test. Cross-training is the countermeasure to constraints. Yes, it takes time, but it won't happen if we don't start doing it.
Trap 2: Setting WIP limits too high.
If your limit is 30 and you never hit it, it has no effect. A good WIP limit should hurt a little. It should force conversations about priority.
Trap 3: Blaming the constraint.
If QA is your bottleneck, the problem isn't QA. The problem is that you built a system that depends on a bottleneck. Fix the system, don't punish the person standing in the middle of it.
The bottom line
Constraint thinking isn't new. Goldratt wrote The Goal in 1984. Reinertsen has been teaching flow economics for 20 years. The Scrum Guide has emphasized focus and completion since 2010. But most organizations still operate under the assumption that more activity equals more value.
The DORA 2024 report provides fresh evidence: systems thinking outperforms local optimization every time. You can have the fastest developers in the world, but if they're feeding a constraint, you're just building a bigger queue.
So here's my challenge to you. This week, make your work visible. Count what's in progress. Find your constraint. Set one WIP limit. Just one. See what happens.
Stop rewarding people for starting things. Reward them for finishing.
References & Further Reading
- DORA 2024 Accelerate State of DevOps Report: https://dora.dev/research/2024/dora-report/
- Reinertsen, Donald G. The Principles of Product Development Flow: https://www.reinertsenassociates.com
- Sutherland, Jeff. Scrum: The Art of Doing Twice the Work in Half the Time: https://www.scruminc.com
- Cohn, Mike. Agile Estimating and Planning: https://www.mountaingoatsoftware.com
- Goldratt, Eliyahu M. The Goal: A Process of Ongoing Improvement
- The Scrum Guide (2020): https://scrumguides.org