Saturday, September 3, 2011

Unplanned items, velocity and prisoner check-mark technique






(note added: I'm going to write a short version of this post later, so there's no need to read all of it now.)
This post is about a way to give visibility into what happens during the sprint, including the "unplanned items": using an unplanned-items burn up, measuring the actual time with "prisoner metrics" check marks, and reporting not just the velocity as the total story points of the sprint, but also the "net velocity", calculated from the time actually available during the sprint.
By rearranging the items according to the gap between the actual average velocity and the estimated one, the team can effectively answer the question "what went wrong?".
We will be able to see:
- the actual time spent working on each item;
- the actual time spent on unplanned items;
- the actual velocity of the sprint, and a projection of what that velocity could have been without unplanned items, using the "net velocity";
- at the end of the sprint, which tech areas we were late in, and some guesses about why, in particular using information from the skill matrix.
The team is not supposed to add anything to the sprint, but it happens anyway: fixing bugs coming from the previous sprint, attending important meetings, or taking some slack time. All of this effectively changes the size of the container.
If we can't get rid of unplanned items, at least let's have visibility about them.

Instead of aborting sprints, or, worse, tending to hide what is really happening, or (better) using the focus factor technique, I'd suggest an alternative that consists in making visible:
- the actual time spent on each item;
- the burn up of the time spent on unplanned items.
At the end of the sprint we can show where there is a need to improve, by distributing the items colored by their technology area, together with the skill rank of whoever worked on each item.

Here we can see a simulation of a sprint.
We have a 10-day sprint, a "team" of two members (just for simplicity), and an estimated velocity of 10 story points (i.e. 0.5 story points per man-day).
(Please note that I am not dwelling too much on the distinction between tasks and user stories. In the first part of the post, the items can be considered pretty much as if they were user stories.)
Here is the taskboard at day 0:
Items A and B are selected for "doing":
After one day, we use the "prisoner metrics", which simply means using check marks to visualize the time spent on each item:
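The bookkeeping behind the check marks can be sketched in a few lines of Python. The `Board` class and the mark symbols are made up for illustration; on a real taskboard the marks are of course just pen strokes on the cards:

```python
from collections import defaultdict

# A sketch of the "prisoner metrics": one mark per man-day spent on an
# item. "x" stands for a black check mark (worked), "r" for a red one
# (blocked); names are illustrative, not part of the original post.
class Board:
    def __init__(self):
        self.marks = defaultdict(list)  # item name -> list of marks

    def work(self, item):
        self.marks[item].append("x")

    def blocked(self, item):
        self.marks[item].append("r")

    def days_on(self, item):
        return len(self.marks[item])

board = Board()
for day in range(2):   # two days of work on items A and B
    board.work("A")
    board.work("B")

print(board.days_on("A"))  # 2 marks on A so far
```

Counting the marks at the end of the sprint then becomes a simple sum over `board.marks`.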

After another day, nothing changes, except that we add another check mark to both items:

At the beginning of the fourth day, item A is done; another mark is put on it (so the information about how many days it took stays consistent).

...and item C is in "doing" now:
Another day passes, and item C is check-marked accordingly:
Day 6: B is finished; it gets another mark (a big cross mark in this case, since a cross mark has to be put every 5 working days), and item D is checked out.
Unfortunately, during the day an unplanned item comes in and the work on C is interrupted. We track it by putting that activity over C, in the following way:

The question now is: what about the burn down and the unplanned item?
The burn down will probably be flatter, but we may not know the reason, so a new "burn up" chart will be useful to show how much time is being subtracted from the sprint.
In this new graph the Y-axis values are based on time, not on story points. In fact we are not estimating the new items; there is no need for that.
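As a minimal sketch, the burn up is just a cumulative sum of the man-days spent on unplanned work. The day-by-day split below is an assumption (one person on the unplanned item on days 6 through 9); the taskboard pictures carry the real numbers:

```python
# Man-days spent on unplanned items, per sprint day (assumed split:
# one person on the unplanned item on days 6-9 of a 10-day sprint).
unplanned_per_day = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

burn_up = []
total = 0
for spent in unplanned_per_day:
    total += spent
    burn_up.append(total)

print(burn_up)  # [0, 0, 0, 0, 0, 1, 2, 3, 4, 4]
```

The chart on the wall is just this list plotted against the sprint days.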
Another day passes, and we mark the unplanned item with a check mark as well. And what do we do with C, which is still there, unfinished, even though a day has passed anyway? We decide to check-mark it anyway, with a different color (red). And the burn up keeps tracking the time:
Day 8. Same song:
Day 9: we realize that D is finished.
... so the activities can be parallelized:
Day 10: the unplanned activity is done:

One of the two guys is free, and is (legitimately) taking some slack. We make this evident as well, so that we will be able to consistently count all the check marks at the end of the sprint (for example, to count the actual man-days available in the sprint).
Day 11. All is done, and the sprint is at its end:
What we can see here:
- the number of black check marks is the total of the man-days that were available during the sprint;
- the number of black check marks on the planned items is the number of man-days spent on what was decided during the sprint planning;
- the number of red check marks shows how much time activities have spent in a blocked state because of urgent unplanned items;
- the burn up chart, and the total of the black check marks on the unplanned items and on the slack, both show how much time was subtracted from the planned items;
- the burn down chart shows the remaining story points, as usual.
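Put as arithmetic, the totals for this example sprint look like the following. The split between unplanned and slack marks is an assumption (it depends on the actual board), but the totals match the 2-people-by-10-days setup:

```python
# Assumed mark counts for the example sprint (2 people x 10 days).
black_marks = {"planned": 15, "unplanned": 4, "slack": 1}
red_marks = 2  # assumed: man-days item C sat blocked behind the unplanned item

available_man_days = sum(black_marks.values())
subtracted = black_marks["unplanned"] + black_marks["slack"]

print(available_man_days)  # 20: total man-days in the sprint
print(subtracted)          # 5: time taken away from the planned items
```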

About the sprint velocity:
the velocity here is 10 story points. But what if the unplanned items and the slack were "invisible"?
1) we could have reported that in twenty man-days we were able to complete ten story points, so that the man-day velocity is 0.5 story points;
2) we may have given the false perception that it is OK to add unplanned items during the sprint;
3) we would forget to consider those data, and so we would not be able to look at them in retrospect to set some improvement direction during the retrospective meeting.
This way we can also know the expected value of the velocity of the sprint, had there been no interruptions.
To calculate it we just multiply the story points by the ratio between the available man-days and the man-days spent on the actual planned items:
10 * 20 / 15 = 13.33
By dividing this value by 20 (the man-days available in the sprint), we easily find that the average story points per man-day is approximately 0.66 (not 0.5).
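The same computation in a few lines, with the numbers from this example:

```python
# Net velocity: scale the completed story points by the ratio of
# available man-days to man-days spent on the planned items.
story_points = 10
available_man_days = 20
planned_man_days = 15

net_velocity = story_points * available_man_days / planned_man_days
per_man_day = net_velocity / available_man_days

print(round(net_velocity, 2))  # 13.33
print(round(per_man_day, 2))   # 0.67
```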
Thus we can have a better projection of the "actual" velocity, and we can understand what would happen if we removed the interruptions (as we should) in the next sprint.
We may also discuss what is really making us slower, just in case (external interruptions or bad code quality?). For example, according to the "quality of non-declining velocity model", if the problem is low code base quality the effect shows up in the net velocity, while if the problem is interruptions the effect shows up in the total velocity.
Maybe we have both, and then it is much more difficult to decide what to do.
Here we can see how to reason about those data, for example during a retrospective.
We know the gap between the average velocity and the actual velocity for each item.
For example: item A had an estimate of 3 story points, and took 3 man-days. According to the actual "net velocity", the expected number of story points done in three man-days is 0.66 * 3 ≈ 2. So this story is at +1 with respect to its estimate.
Let's make some calculations about all these actual vs. estimated values
(p.s.: note that though the comparison uses time, there are no actual time-based estimates).
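The per-item gap can be computed like this; item A's numbers are the ones from the example above, and the helper function is just for illustration:

```python
# Net story points per man-day from the example sprint (10 * 20/15 SP
# over 20 available man-days, i.e. about 0.66).
net_velocity_per_man_day = 10 * 20 / 15 / 20

def gap(estimated_sp, man_days):
    """Estimated story points minus what the net velocity predicts."""
    expected_sp = net_velocity_per_man_day * man_days
    return estimated_sp - expected_sp

print(round(gap(3, 3), 1))  # +1.0 for item A: better than expected
```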

An interesting game is laying the items out on X,Y axes (X is the timeline, and Y is the distribution of the gap between the actual and the expected time, according to the net velocity):
(We don't care about the Y of slack or unplanned items, because they are not estimated and shouldn't be.)
Moreover, if the items are proper tasks (instead of generic items) we can introduce more dimensions in this diagram, using colors for tech areas and symbols from the skill matrix.
(Tasks are not supposed to have story points, but that does not matter so much. We can still distribute the story points of a story over its tasks, for instance.)
So now, dividing the items per tech area (which makes perfect sense), we can see something like this:
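Grouping by area is straightforward; the item names, areas, and gap values below are invented just to show the shape of the data:

```python
from collections import defaultdict

# Invented items: a positive gap means the item did better than the
# net-velocity expectation; areas stand in for the taskboard colors.
items = [
    {"name": "A", "area": "blue",  "gap":  1.0},
    {"name": "C", "area": "blue",  "gap":  0.7},
    {"name": "B", "area": "green", "gap": -0.5},
    {"name": "D", "area": "green", "gap": -0.3},
]

by_area = defaultdict(list)
for item in items:
    by_area[item["area"]].append(item["gap"])

for area, gaps in by_area.items():
    print(area, round(sum(gaps) / len(gaps), 2))  # average gap per area
```

An average gap well below zero in one area is the signal worth digging into during the retrospective.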

On the blue items we performed better than on the others. One possible reason is that the people who worked on the other items were not so skilled in those areas. To make this evident, we could add the values taken from the skill matrix of whoever worked on each item.
Typically each entry in the skill matrix is empty, has a dot, or has a star.
When we rearrange the items at the end of the sprint we could have something like the following (I replaced the dot with an 'x'):
It's perfectly clear that the guys who worked on the areas that performed well are skilled there, and that may be the most likely reason for the performance, so we don't worry too much about it, except that we may take on some new challenges, such as improving our skills in the areas where we are weaker.
In other cases, for example when the skill matrix data are randomly distributed on the Y axis, our findings can be different (for example that the skill matrix needs to be updated, once we realize that our team has become more cross-functional).

One possible action, given the relevance of some weak area, is re-estimating the components that are closer to the areas where we realized we are weaker than expected.
Or we may adjust the skill matrix in a more realistic way, and so on...





