The question of how much data is required to determine what is happening in a system is a perennial bugaboo. Those predisposed to action tend to think less is more, while the more reticent sometimes wait forever to make a decision. For flow data, how much is enough is more than just a footnote.

In the previous chapters, Vacanti notes “the two fundamental properties of our data that must be met for any XmR analysis to be valid: 1. Successive data values must be logically comparable. 2. The moving ranges need to isolate and capture the local, short-term, routine variation that is inherent to the process.” Defining good limits (see Chapter Four, Part 2) requires a good baseline, or you risk false alarms.

The author leverages a study performed by Dr. Henry Neave, former Deming Professor of Management at Nottingham Trent University, to determine sample size. The study indicates that a sample size of 10 reduces the chance of a false alarm to below 2.5%, while increasing the sample size to 20 adds very little improvement. No sample size drives the chance of false alarms to zero (for those who want to wait one more sprint, that extra data isn’t going to help much). If ten data points are sufficient for a baseline, the question then becomes: which ten? This is where the principle of “logically comparable” deserves significant consideration.
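To build some intuition for why roughly ten points is enough, here is a minimal simulation sketch (my illustration, not Neave’s study, and the per-point false alarm rate it estimates may not be the exact metric he used): draw baselines of different sizes from a stable, normally distributed process, compute XmR limits from each baseline, and count how often genuinely in-control points land outside those limits. Every number in it is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(42)

def false_alarm_rate(baseline_size, trials=2_000, test_points=100):
    """Estimate how often points from a stable process fall outside
    X chart limits computed from a baseline of `baseline_size` points."""
    alarms = 0
    for _ in range(trials):
        baseline = rng.normal(0.0, 1.0, baseline_size)
        mr_bar = np.abs(np.diff(baseline)).mean()        # average moving range
        centre = baseline.mean()
        lnpl, unpl = centre - 2.66 * mr_bar, centre + 2.66 * mr_bar
        test = rng.normal(0.0, 1.0, test_points)         # in-control data
        alarms += np.count_nonzero((test < lnpl) | (test > unpl))
    return alarms / (trials * test_points)

for n in (5, 10, 20, 50):
    print(f"baseline of {n:2d} points -> false alarm rate ~ {false_alarm_rate(n):.2%}")
```

The pattern to look for in the output is the one the text describes: diminishing returns past roughly ten points, and a rate that never reaches zero.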

Returning to Week 3 of the re-read, let’s consider the US Employment-Population Ratio: Age 25 to 34 (a large data set with a shock in it). If we used any 10 months from 2020, would that baseline be logically comparable to any other 10 months in the data set? No.

Shocks, like COVID, are easy to “see” in the data. In other circumstances, determining whether one group of data is logically comparable to another may be more difficult. Would Upper and Lower Natural Process Limits set from a 2019 sample be logically comparable to data from 2022 or 2023? I would make that comparison, even though many would argue that the world is a very different place.

The X chart with Natural Process Limits (based on the entire sample) for the employment-population ratio for 25 to 34-year-olds is shown below.

Except for the two recovery years of 2021 and 2022, most observations fall outside the limits and would therefore be flagged as exceptional. Limits set from the whole dataset are not useful. Before we diagnose the problem, let us consider the second principle.
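For readers who want to see the arithmetic, here is a minimal sketch of how X chart Natural Process Limits are computed: the centre line is the mean of the individual values, and the limits sit at the mean plus or minus 2.66 times the average moving range. The monthly values below are placeholders with a COVID-style shock, not the actual employment-population series.

```python
import numpy as np

def x_chart_limits(values):
    """X chart: centre line = mean; limits = mean +/- 2.66 * average moving range."""
    values = np.asarray(values, dtype=float)
    mr_bar = np.abs(np.diff(values)).mean()   # average of |x[i] - x[i-1]|
    centre = values.mean()
    return centre - 2.66 * mr_bar, centre, centre + 2.66 * mr_bar

# Placeholder monthly ratios with a shock and recovery; swap in the real data.
ratios = [80.1, 80.0, 79.9, 66.2, 68.5, 71.0, 73.2, 74.5, 75.6, 76.4,
          77.1, 77.8, 78.4, 78.9, 79.2, 79.5, 79.7, 79.8, 79.9, 80.0]
lnpl, centre, unpl = x_chart_limits(ratios)
print(f"LNPL = {lnpl:.2f}   centre = {centre:.2f}   UNPL = {unpl:.2f}")
```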

The second principle is “the moving ranges need to isolate and capture the local, short-term, routine variation that is inherent to the process.” Constructing the mR chart for the dataset will help assess what is natural variation in the data and what is exceptional.  
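A companion sketch for the mR chart, using the same placeholder data as above: plot the absolute difference between successive values against an Upper Range Limit of 3.268 times the average moving range (the lower limit is zero); any range above that limit signals exceptional variation.

```python
import numpy as np

def mr_chart(values):
    """mR chart: moving ranges vs. an Upper Range Limit of 3.268 * average mR."""
    values = np.asarray(values, dtype=float)
    moving_ranges = np.abs(np.diff(values))
    upper_range_limit = 3.268 * moving_ranges.mean()   # lower limit is zero
    exceptional = moving_ranges > upper_range_limit
    return moving_ranges, upper_range_limit, exceptional

# Placeholder monthly ratios (same values as the X chart sketch above).
ratios = [80.1, 80.0, 79.9, 66.2, 68.5, 71.0, 73.2, 74.5, 75.6, 76.4,
          77.1, 77.8, 78.4, 78.9, 79.2, 79.5, 79.7, 79.8, 79.9, 80.0]
moving_ranges, url, exceptional = mr_chart(ratios)
print(f"URL = {url:.2f}; exceptional ranges at indices {np.where(exceptional)[0]}")
```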

Based on the whole data set, the April 30, 2020 observation is the only exceptional point on the mR chart.

If we use observations from the first ten months of 2022 to reset the baseline, the limits in the XmR chart look very different. 
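Here is a sketch of that reset, again on placeholder data: compute the limits from a ten-point window judged to be logically comparable (the last ten points below stand in for the first ten months of 2022) and then apply those limits to the whole series.

```python
import numpy as np

# Placeholder series with a shock and a recovery; not the real data.
full_series = np.array([80.1, 80.0, 79.9, 66.2, 68.5, 71.0, 73.2, 74.5, 75.6, 76.4,
                        77.1, 77.8, 78.4, 78.9, 79.2, 79.5, 79.7, 79.8, 79.9, 80.0])

baseline = full_series[-10:]                  # ten recent, logically comparable points
mr_bar = np.abs(np.diff(baseline)).mean()
centre = baseline.mean()
lnpl, unpl = centre - 2.66 * mr_bar, centre + 2.66 * mr_bar

# Points outside the reset limits are flagged as exceptional; the shock months
# (and anything else the narrower limits exclude) now stand out.
signals = np.where((full_series < lnpl) | (full_series > unpl))[0]
print(f"LNPL = {lnpl:.2f}   UNPL = {unpl:.2f}   exceptional at indices {signals}")
```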

The ratio seems to have recovered to the same systemic level as before COVID. The new baseline also shows the COVID shock to be exceptional, outside the norm (no great shock, pun intended).

When should you reset the baseline? When shocks change the capability of the system, as COVID did here, or when the data goes stale. For example, I regularly work with dynamic teams; over time, as people change, so do the profile of the team and its performance characteristics. When that happens, the distant past is not useful: imagine using throughput data from a 1990 COBOL team to analyze a modern software development team.

Buy a copy and get reading – Actionable Agile Metrics Volume II, Advanced Topics in Predictability.  

Week 1: Re-read Logistics and Preface https://bit.ly/4adgxsC

Week 2: Wilt The Stilt and Definition of Variation https://bit.ly/4aldwGN

Week 3: Variation and Predictability  – https://bit.ly/3tAVWhq 

Week 4: Process Behavior Charts, Part 1 – https://bit.ly/3Huainr

Week 5: Process Behavior Charts, Part 2 – https://bit.ly/424O5Wc