Buffer your bottlenecks
Because the bottleneck is so critical, we want to make sure that it rarely (preferably never) runs out of work. You might wonder how a bottleneck can run dry when, by definition, it is fed by a wider pipe. The thing to remember is that in our kanban system we're purposely limiting work-in-progress (WIP), and work items vary in size, so a couple of large items being processed upstream could temporarily starve the bottleneck.
The solution is to place a buffer in front of the bottleneck, as shown in the diagram below. In this example, the development team is the bottleneck, so a buffer with a limit of 3 work items has been inserted immediately before it (the numbers at the top of the columns are the limits).
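To see why a buffer helps, here is a rough simulation sketch. It assumes a single upstream stage with a WIP limit of 1 feeding the bottleneck, and the item sizes are made up. The function name `starved_time` and its parameters are hypothetical, chosen only for this illustration.

```python
def starved_time(upstream_sizes, bottleneck_size, buffer_limit):
    """Return how long the bottleneck sits idle waiting for work.

    upstream_sizes:  time the upstream stage spends on each item (WIP limit 1)
    bottleneck_size: time the bottleneck spends on every item
    buffer_limit:    how many finished items may wait in front of the bottleneck
    """
    upstream_free_at = 0  # when the upstream stage can start its next item
    finished = []         # times at which the bottleneck finished each item
    idle = 0
    for i, size in enumerate(upstream_sizes):
        done_upstream = upstream_free_at + size
        # An item can only leave upstream once there is room downstream:
        # the buffer holds buffer_limit items and the bottleneck holds one.
        blocker = i - buffer_limit - 1
        if blocker < 0:
            handed_over = done_upstream
        else:
            handed_over = max(done_upstream, finished[blocker])
        upstream_free_at = handed_over
        # Idle time before the very first item is warm-up and is not counted.
        bottleneck_free_at = finished[-1] if finished else handed_over
        start = max(handed_over, bottleneck_free_at)
        idle += start - bottleneck_free_at
        finished.append(start + bottleneck_size)
    return idle


# Made-up workload: mostly small items with an occasional large one.
sizes = [1, 1, 1, 6] * 3
for limit in (0, 1, 3):
    print(limit, starved_time(sizes, 3, limit))
```

With these invented numbers the upstream stage is faster on average (2.25 time units per item against 3), yet with no buffer each large item leaves the bottleneck idle for 3 time units, 9 in total. A buffer limit of 1 or 3 removes the starvation entirely.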
How do you size the buffer?
Each extra item of WIP carries a penalty in terms of lead time, so start with a small number and adjust it up or down empirically. You might also consider breaking up larger work items into smaller ones or, conversely, combining several small items into larger ones. Reducing the variability in work item size may allow you to reduce the size of the buffer.
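The lead-time penalty can be estimated with Little's Law, which says that in a stable system the average lead time equals the average WIP divided by the average throughput. The sketch below is only illustrative: the function name `average_lead_time` and the throughput of 2 items per week are assumptions, not figures from a real board.

```python
def average_lead_time(avg_wip, avg_throughput):
    """Little's Law: average lead time = average WIP / average throughput."""
    if avg_throughput <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip / avg_throughput


# Assumed figure: the pipeline finishes 2 items per week on average.
throughput_per_week = 2.0
for wip in (8, 9, 10, 11):
    print(wip, "items in progress ->", average_lead_time(wip, throughput_per_week), "weeks")
```

With these assumed numbers, each extra item of WIP adds half a week to the average lead time, which is why it pays to start with a small buffer and grow it only if the bottleneck is still being starved.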
How many buffers do we need?
Strictly speaking, there can be only one bottleneck in the pipeline at any point in time. However, when you have variation (not just in the size of work items, but also in the type of work and the people involved), the bottleneck may move from time to time. After running the kanban system for a while, you'll get a feel for where the bottlenecks most commonly appear. Place your buffers in front of those stages.