Over the years I have gravitated towards the idea that work entry issues are the single clearest indicator of problems in an organization. My first job after getting my bachelor’s degree was for a women’s clothing company. We had salesmen jumping the queue to get their orders into the system or shipped earlier. We created all sorts of rules to help control the process. The rules generated a lot of overhead and anger. Reflecting on the last six months, I have seen many of the same issues with teams I have worked with. Work jumps the queue because no one takes the time to consider or reconsider urgency. How can software changes for daylight saving time ever be a surprise and become urgent? I am not naive enough to think every event or defect is predictable, but having pieces of work thrust into a sprint or iteration after it begins reflects failures of thought, control, and caring.

The second category in our tour through why committed stories don’t get done in a sprint is not controlling the work entry process after a sprint starts. (If you think just having a work entry process is the solution, you are in for a rude awakening. Assuming you have a process, the process is never the root problem.) Two of the most critical roots of work entry problems during a sprint/iteration are:
Lack of Planning
A quote that has stuck with me over the years is “A lack of planning on your part does not necessitate an emergency on mine” (the quote is attributed to Bob Carter). While I am not a fan of big upfront planning or design, not spending the time to know what can and should be known before you start a process is silly. Unfortunately, my example of the change to daylight saving time is not apocryphal. Earlier this year I watched as a team had to scramble to change a hardcoded table (yeah . . .) during a sprint because the change was not on the backlog. A little prodding exposed that the problem had occurred the previous fall and the spring before that, and that the team had discussed it in retrospectives. The tech lead shot down making system changes, and the product owner pointed out that the fix only took a couple of hours to make, so why take up room on the backlog for the issue? Just do it. Neither of them had to put their work down and make the changes.
Potential Solutions: The nuclear option is to say no. In critical cases, such as severe problems, this option will be overridden, leaving the team feeling even less empowered than before. A far more useful solution is to ensure that there are consequences for adding work to an in-flight sprint. Any piece of work inserted into a sprint will disrupt another piece of work unless there is slack in the system at the constraint in the process. Expediting is expensive, so avoid it. If work enters, remove (pause) work that will not cause a ripple effect of urgency. The person who puts the work into the sprint needs to choose what to delay and, if they are not the stakeholder for the delayed work, must get permission from the stakeholders harmed. They also must declare their culpability. Another, less punitive, approach is to adopt the concept of full-kitting suggested by Steve Tendon and Daniel Doiron in their book Tame your Work Flow (see our re-read of Chapter 17, An Introduction to Full-Kitting). Full-kitting is more than just a definition of ready (which is another good idea); it ensures that work, once started, does not stop and start. Just as importantly, full-kitting ensures that people think about what can get in the way of getting to production BEFORE work starts, which exposes lots of things that should be known and planned for up front.
Poor Quality
Code breaks for many reasons. As Christian Clausen (Dr. Lambda) pointed out on SPaMCAST 623, code is a liability. We write code because we want the functionality it enables. The problem is that code reflects a moment in time; it is often brittle, hard to understand, and not as maintainable as it should be (see the hard coding example above). Poorly refactored code takes effort and time to fix that everyone would rather spend elsewhere. Code quality matters to everyone. When production is down, it is difficult to tell a customer or a stakeholder that you’ll get to it next week (unless the problem is superficial, someone will think it is the end of the world and that you need to deal with it now).
Potential Solutions: There are many potential solutions for improving code quality. They all start with the intent of delivering better solutions. The first solution is refactoring. Martin Fowler defines refactoring as “a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.” After solving the business problem, developers should make sure the code is clean and maintainable, which will reduce the number of defects sent to production and reduce the amount of time needed for repair (a double win). Other techniques that lead to better code include continuous integration (CI), mobbing, and automated testing.
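The hardcoded table from the daylight saving time story is a small candidate for exactly this kind of refactoring. What follows is only a minimal sketch, not the team's actual code: I am assuming the table held US transition dates (the story does not say what it contained), and the names HARDCODED_DST, nth_sunday, us_dst_dates, and check_refactoring are all made up for illustration. The idea is to replace the table with the rule it encodes and use an automated check to confirm that the external behavior has not changed.

```python
from __future__ import annotations

from datetime import date, timedelta

# Hypothetical "before": a table someone must remember to extend every year.
HARDCODED_DST = {
    2023: (date(2023, 3, 12), date(2023, 11, 5)),
    2024: (date(2024, 3, 10), date(2024, 11, 3)),
}


def nth_sunday(year: int, month: int, n: int) -> date:
    """Return the n-th Sunday (1-based) of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Sunday is 6.
    days_to_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def us_dst_dates(year: int) -> tuple[date, date]:
    """Hypothetical "after": derive the dates from the US rule in effect
    since 2007 (second Sunday in March to first Sunday in November)."""
    return nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)


def check_refactoring() -> None:
    """Automated check: the rule must reproduce every row of the old table."""
    for year, expected in HARDCODED_DST.items():
        assert us_dst_dates(year) == expected, year


if __name__ == "__main__":
    check_refactoring()
    print(us_dst_dates(2025))
```

Encoding the rule by hand keeps the sketch dependency-free, but governments do change these rules. In real code a maintained time zone database, such as the one behind Python's standard zoneinfo module, is usually the safer replacement for a table like this.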
It might seem like a paradox, but controlling work entry, which means delaying the start of some work, makes it possible to get more work done in a more predictable manner.