THE AGILE CULTURE: LEADING THROUGH TRUST AND OWNERSHIP (2014)
Chapter 9. Metrics
The Big Ideas
People do what they are measured by.
We need metrics that focus on delivering business value, foster trust, and do not take away ownership.
If we are moving toward Energy and Innovation (the green), we need an indication of the progress we are making so that we can tune our actions.
Why Metrics Matter
I have spent my career consumed by metrics. According to the old saw, “If you can measure it, you can manage it.” In our quest to manage everything, we sometimes implement a wide range of overlapping, competing, and sometimes useless metrics. For example, a few years ago a technology company asked me to work with its software engineering team to finalize how to effectively measure software productivity and quality. My first interaction with the team was to review the metrics they had already defined and agreed were important. The head of engineering proudly fired up the projector so that he could show me on the big screen the 63 metrics the department had decided were critical.
I was taken aback. “Sixty-three? Do you think that covers it all or are you missing some?”
The head of engineering was not yet used to my cynical observations and so replied, “Well, there are some others that we could not get consensus on that some felt were important. They are on a different worksheet.” He then highlighted a different tab to show the 19 other metrics that could have made the cut, and continued, “If you think these are worth including, I am sure we can figure out a way to get them in.”
I looked around the room. This group had put a lot of effort into defining these measures, and I did not want to devalue their work in any way, but 63 metrics? My personal rule of thumb is that you need only five to seven metrics. Any more than that and you are likely trending into meaningless measures. But how could I get that point across without causing harm?
Then, I had an epiphany. “We want metrics that help us discriminate between activity and accomplishment. With good metrics, we will know that we are getting things done, not just being busy. Let’s take a look at your metrics. Which align with getting the right things done, making progress on your critical business goals?”
With this, we launched into a great conversation about meaningful metrics. We ended up with seven measures that met my guidelines. The agreed-upon metrics
Were focused on measuring progress toward business goals—thus meeting the accomplishment-over-activity criterion.
Were few in number—otherwise the sheer volume makes them meaningless.
Motivated the right behaviors rather than being something used as a weapon against others. Too often we impose metrics to punish wrong behaviors rather than inspire improved performance.
Were designed to measure processes, not people. Meaningful metrics help us identify when processes, not people, need to be fixed.
Were simple to measure and simple to understand.
Good metrics are essential in enabling and encouraging fact-based action in business and can have a powerful effect in helping teams focus on business value and collaboration.
However, we have all seen many cases where ill-considered, bad metrics have caused counterproductive motivation and severe damage:
Mortgage salesmen who were measured and rewarded on sales rather than long-term profitability and sustainability are at the root of much of today’s world financial crisis.
Programmers measured on schedule rather than business value create poorly targeted code with low quality, which is costly to maintain and decreases revenues.
Support center operatives measured on the number of calls handled or on call length rather than on customer satisfaction abruptly terminate calls, resulting in increasing numbers of calls, unhappy customers, and loss of business.
Salesmen measured on individual product sales sell inappropriate solutions that don’t perform, damaging both the customers’ and the business’s reputation.
When Lou Gerstner took over as CEO of IBM, the company was tearing itself apart because of internal competition. At that time, business unit success (and executive reward) was measured independently for each division with the intent of motivating teams to deliver increasing value. A nasty side effect was that business units were effectively in competition for, and often in front of, the customer and not motivated to help each other deliver the best end-to-end, totally integrated solutions. Gerstner was told by many that he needed to change the metrics in order to rebuild an IBM-wide focus on customer success. Lou’s immediate response was that individuals needed to ignore the current dysfunctional metrics and collaborate to do what they knew to be right for the customer and for the business. Over the next several years measurements and reward schemes were modified to cover larger organizational elements until they were eventually based on the whole of IBM’s performance. This not only removed the motivation for competition with internal teams and individuals, but also sent a strong signal that collaboration to deliver business value to the customer was the highest priority.
Bad metrics can create boundaries, suboptimization, and conflict, all of which result in poor performance and lost revenue. Good metrics, which focus on overall customer and business value, naturally foster collaboration and ownership. So what are the good metrics? In this chapter, we explore what metrics will maximize the delivery of business value, support dealing honestly with ambiguity, foster trust and collaboration, and not take away ownership.
Before we get into a discussion about metrics we must start with the overall environment in which they are used.
The foundational requirement of good metrics is, yet again, integrity and honesty.
Plans must be honestly acknowledged as a goal and the most likely outcome, not as an inflexible straitjacket.
Accuracy must not be expected or claimed where it does not or cannot exist. Effort estimates are just that, estimates with a significant degree of uncertainty.
Actions and owners must be focused correctly. For example, productivity measures are often used with the intent of motivating individuals to deliver more, rather than as a measure for the leader or executive who really has the responsibility to enable and support the team.
Care must always be taken to ensure that the common problems such as gaming the system and green shift (the tendency of the outlook and status to get better as they are reported higher up the organization) are countered effectively. Try to make the measures completely objective, and where that is not possible, consider how those asked to provide the data will feel and act, especially if they believe that the data will reflect on their performance.
Overall metrics must be seen to be honest and allow for honesty. Failure here will drive a huge wedge between the delivery teams and the business.
Development teams are often measured on the number of problems arising from the product’s use in customer environments. With one team, management wanted detailed explanations of minor variations in the number of customer problem reports, wasting large amounts of the leader’s and team’s effort. When the team tried to explain that this was statistical noise, they were overruled and told they were being unhelpful. The team soon learned that the executives involved were mathematically incompetent and more focused on demonstrating their own power than really improving customer experience. Pay attention to statistics. When numbers are small, variations are often statistical noise.
Note that even without integrity in the measurements, good individuals and teams will often still do what is correct for the customer and the business even if the measure drives them in the opposite direction. While this behavior can mitigate the problem, it still generates a feeling within the team that management is incompetent and untrustworthy. It is also difficult for individuals to sustain this position, and if management does not change, the team will eventually give up and conform to the bad measures.
Even well-intentioned teams will be worn down by poorly targeted metrics.
Results Not Process
On the electronic stock exchange project, the team was building a workstation for the traders. Remember, the traders were trading on the floor so we had no workstation to look at. The team built the first user interface prototype and set out to the first meeting with some traders.
“How did it go?” their manager asked.
“Don’t worry. We will reuse some of that code.”
“I don’t care,” was the reply. “I only care about one thing. When the system goes live, no trader throws a workstation out the window.”
After 20 years, no workstation has ever been thrown out the window.
The key measure here was customer acceptance and not internal development metrics.
Many project failures arise from measuring the process rather than the results or outcomes. Take, for example, measuring how well the team follows the plan, the worst measure you can use. Why? What if we learn the customer doesn’t want something? What if the customer wants something different? What if we miss something the business needs? The team may meet the date but deliver poor quality. Ownership is taken away from the teams and falls to the process.
Teams are essentially being told to “follow the process” rather than focusing on meeting customer and business needs. These types of measures disconnect the team from the business and customer and can result in products that no one wants or that customers hate to use.
Process-based measures are always subject to gaming, that is, to getting around the measurement. If development effectiveness is measured by the amount of code delivered to test, you can be sure that code quality will be poor and that overall defects and costs will rise. It is much better to measure unit test escapes, which have a very significant effect on overall productivity. (A defect missed in unit test but found later costs very much more development effort to diagnose and fix.) If process-based metrics cannot be avoided, honestly consider how they will be gamed and what mitigations you can put in place. The more controlling the process, and the more it gets in the way of the team doing the right thing, the more it will be gamed.
The most effective measures are strongly aligned to outcomes such as business and customer results. No amount of internal measures can disguise actual customer satisfaction or real business results. Results or outcome-based measures cannot be gamed.
Whatever you do, make sure your metrics do not remove ownership from the delivery team in any way.
Learning Not Punishment
Someone once said it is only a failure if you don’t learn from it.
When metrics are used to place blame, individuals and teams will always present information in a positive light to avoid getting blamed. In this case the chance for making any needed improvement is lost and everyone is encouraged to game the system.
Base your metrics on learning. Every problem is an opportunity for improvement. Every failure is an opportunity to learn a better solution. Any new innovation and product can come from learning what customers really want, what will delight them.
If teams or individuals believe that the metrics will be used against them, you can be sure the data will be adjusted to make them look good in the eyes of the business.
If metrics are used for a purpose other than learning (often as some form of competitive comparison between teams) they will be gamed. Gaming this type of measurement system occurs all the time, even if the metrics are well chosen. If you must take this approach, make data input automatic if possible but otherwise as easy as possible without sacrificing data quality. Explain to teams that you are well aware of the possibility of gaming. Show them how gaming the metrics actually hurts the learning process and the business.
Some years ago, a company was trying to assess the deployment of specific agile iterative practices. Initially each team was polled and asked whether the practices were used. Results were generally seen to be good. Sometime later, as part of a wider assessment of Agile credibility and adoption, an anonymous survey was carried out, polling individuals across the company. Results were aggregated only at a major organizational level. With the fear of identification (and potential punishment) removed, the results were markedly different and very much lower. These indicated that much more action was needed to educate and promote the use of agile practices throughout the organization, while the “official” statistics were saying that all was well.
Measuring Culture Change
While metrics, in general, have a strong effect on culture, we will first focus on ways of assessing our progress in building an effective culture. We cover metrics in general in the next section. Here we are looking specifically at assessing the state of our organization, its leadership and culture.
As leaders work to change the culture within their organization they urgently need feedback on the progress they are making. And they need early indications to know when and how to modify their course.
How can we measure that we are successfully moving into the green?
Leadership effectiveness in enabling the change
Business alignment and purpose
Honestly dealing with ambiguity and complexity
Let’s take a closer look.
Measuring cultural change is difficult. We have identified the areas that we need to assess, but these are far from simple metrics. You will need a good degree of judgment. In the sections that follow we will identify simple metrics where possible. Where these are not available, we will identify a set of questions you can ask yourself that will give you positive and negative indicators of progress.
A common approach is to survey the workers within the organization and ask them to anonymously rate their environment for the characteristics we’ve listed here. If it makes sense in your organization, we recommend the use of the Net Promoter Score (NPS) for this. This may not give an accurate overall view of the situation, but it will certainly give the working-level view.
The Net Promoter Score is a common industry approach developed by Fred Reichheld1 and explained in a Harvard Business Review article. Full explanations and the rationale of the approach are easily found on the web, but essentially you ask the “customers” of a product or service how likely they would be—on a scale of 1 to 10—to recommend your products and services to their peers, friends, families, et cetera. Those who score the service 9 or 10 are considered “promoters” (they like your product/service). Those who score the service 1 to 6 are considered “detractors” (they don’t like your product/service), and those who score 7 or 8 are neutral. The NPS is calculated as the percentage of promoters less the percentage of detractors. This means that the NPS can run from +100% where everyone recommends your product or service to -100% where everyone thinks it valueless.
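The arithmetic just described can be sketched in a few lines. This is a minimal illustration following the book's 1-to-10 convention; the function name and sample data are hypothetical, not from the book:

```python
def net_promoter_score(ratings):
    """NPS: percentage of promoters (9-10) minus percentage of detractors (1-6)."""
    if not ratings:
        raise ValueError("no ratings to score")
    promoters = sum(1 for r in ratings if r >= 9)   # scores of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # scores of 1 to 6
    # 7s and 8s are neutral: they count in the denominator but neither tally
    return 100.0 * (promoters - detractors) / len(ratings)

# Hypothetical survey: 5 promoters, 3 neutrals, 2 detractors out of 10 responses
# (5 - 2) / 10 = +30.0
print(net_promoter_score([9, 10, 9, 9, 10, 7, 8, 7, 3, 6]))
```

Note how the neutrals pull the score toward zero without counting against you, which is why a strongly positive NPS requires a large share of outright promoters.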
Another method to measure your progress is to run a collaborative sticky note session with the team. (See Appendix C, Collaboration Process.) In this session ask them what would make them more effective. This is the approach we used to gather the information presented in the introduction to the Trust-Ownership Model. Its power is that it will give a strong indication of the areas that need work without biasing the response by asking questions about specific areas of leadership behavior, business processes, or anything else. If improving trust comes at the top of your list, then that is what you most need to work on.
Leadership Effectiveness Metrics
The most critical factor in building a powerful culture is the effectiveness of our leaders. And the most useful measure of leadership effectiveness in promoting a new culture or approach is the leader’s Net Promoter Score (NPS). When we ask teams how much value there is in a new approach (i.e., would they recommend their leader’s approach to their peers?) we get insight into our progress in building an understanding of the value of the approach. As with any survey-style metrics, it is important to set up the survey so that individuals do not feel in any way pressured to affirm progress. The usual way to do this is to make the survey anonymous and publish data only at an aggregate level.
As IBM worked to promote the use of Agile across the company, it measured its progress on both mindset and deployment. The first measurement was an NPS on the agile practices being used. We set up a survey so that individuals could provide their opinions of the value of agile practices within their organization. One of the questions was, “On a scale of 1 to 10, would you recommend the use of short iterations to your peers?” We calculated the NPS from the results and used the scores to assess our progress in explaining the value of the practices to the development teams. A strong positive score indicated that teams valued and were likely to use the practice without further encouragement. Low scores indicated that the practice was not seen as having value to the team and showed us that we had more work to do.
We were initially concerned that there might be a bias—those who were passionate about Agile would vote while others would opt out of the survey. However, the resulting data showed a nice distribution of both positive and negative scores across the practices, suggesting that this was not a significant issue.
The cycle between survey and action was very short. Simply sharing the results with the owners of the major divisions immediately allowed them to take action to increase support and training where needed. The only practical limitation was survey fatigue, and so we decided to conduct the survey only once each year.
Even better, it cost almost nothing to conduct this survey. It was a single web page that could be completed quickly.
If you are organizing and conducting a survey, make it as attractive and as easy as possible. Resist the natural tendency to create a comprehensive, complex survey. The longer the survey takes to complete, the lower the participation and the harder to collect and synthesize the results.
In addition to identifying action areas, simply carrying out a survey demonstrates that leaders are engaged. Good leadership effectiveness metric examples include the following:
NPS surveys on practices. Are the new practices well regarded? Are they adding value?
Wider organizational acceptance of the new approach. How well do the business processes support the new practices?
As we rolled out new practices in several companies, we polled the development teams on the degree to which the organizational infrastructure, such as the business processes, finance, audit, IT, and legal, supported the new practices. We rated them on a simple 1-to-5 scale:
1. Requires the new approach.
2. Encourages the new approach.
3. Is neutral.
4. Makes the new approach difficult.
5. Prohibits the new approach.
This allowed us to work with the infrastructure areas to ensure that they were fully enabling the overall business initiative.
Bad leadership effectiveness metric examples include publicly asking teams and individuals if they are using the new methods. This almost always leads to gaming, as teams report what they believe management wants to hear.
How much trust does the leadership have in the professional teams? How do the teams see this trust? Do they believe that they are trusted?
This is difficult to assess. Great Place to Work2 uses a complex survey, but most of us need simpler measures.
As mentioned earlier, our approach of asking teams what would make them more productive and then looking to see where “increasing trust” appears in the list gives a good idea of what the team believes it needs.
Other alternatives you can use to assess the team’s view include asking the team members
Do you feel trusted? (Yes, No, or Some of the Time)
Do you trust your team members?
Do you trust the organization?
Do you trust the organization’s processes?
Here are some questions you can use to assess your trust (and the organization’s trust) in your teams:
What do you trust your teams to own?
What do you have to approve before teams can act?
To what extent do your teams decide what to deliver and by when, based on their own assessments rather than being told?
Do the teams feel that they can make decisions about what will be delivered and by when?
Do they track and manage their own progress?
Are your teams able to make significant decisions without separate management review, or are all decisions made by management?
Is trust trending up or down?
It is also challenging to assess the degree to which teams feel that they have ownership of the business commitments. You can do it best by looking at the behaviors of the teams.
Do your teams make their own commitments?
Do they meet their commitments?
Do they track and manage their own progress?
Do they have sufficient information to make those decisions in line with business needs?
Do they understand the business pressures and the needs of the business?
Do they understand the unique value proposition of their project?
Do they understand the business goals?
How well can they map their projects to these goals?
Dealing Honestly with Ambiguity
This is one of the most important and yet challenging measures, particularly because of our personal and organizational quest for certainty.
The easiest way to get the team’s view of this is to ask them using a version of the NPS: On a scale of 1 to 10, how well do you feel we deal honestly with ambiguity?
And there are some obvious questions you can ask yourself:
Do the business templates include assessments of the uncertainty in any estimate? How do you use these?
Do you make commitments without involving those charged with implementation?
Do your plans admit to no modification?
Is your primary measure of good governance how well people conform to a plan?
The Trust-Ownership Assessment in Chapter 3, Building Trust and Ownership, and Appendix B, Trust-Ownership Assessment, also provide a template for assessing your progress toward Energy and Innovation. Assess your progress often and look for trends.
In the introduction to this chapter, we gave a number of examples of the types of metrics that can compromise the effective delivery of business value by
Focusing on process rather than results.
Building competition between teams or team members.
Encouraging a short-term view.
Wasting time collecting and reporting ineffective metrics.
Encouraging leaders to build a command and control approach by reviewing detailed (and perhaps meaningless) metrics on a regular basis.
Bad metrics can be a wall that stands between your team and Energy and Innovation. In this section, we introduce a set of guidelines that you can use to evaluate your metrics and identify any that are getting in the way of trust and ownership. You can then eliminate or replace the metrics.
Having too many, and often conflicting, metrics disempowers the team and distracts it from working on actual delivery.
Let’s look at a couple of examples.
On taking over a pretty effective team that was using an iterative development process, one of our favorite directors, Nickie, asked that the team stop creating and sending her status reports. She told the team that there was no need to create any reports that they themselves did not consider useful. If she wanted detailed status, she said, she would listen in on the daily team meeting or have a look at the team’s dashboard. For tracking progress, she would attend each iteration reflection to join in the discussion on what had been achieved compared to their goals, the overall project outlook, and the improvement actions being taken.
She made it very clear that the responsibility for delivery was theirs and that she needed to be consulted only if they needed help from her. This immediately showed an increased level of trust. She would rely on the metrics they were using to measure their progress rather than burden them with her own.
Another leader, at risk of having his team’s time wasted by a more senior executive and inveterate micromanager, protected his team by taking over the responsibility of collecting the data and presenting it to the micromanager. His view was that there was no way that he was wasting his team’s time creating metrics that they didn’t value.
Are Our Metrics of Any Use?
How do we make sure that the metrics we are using are effective and not just vehicles for giving leaders a feeling of power and control while not achieving anything useful? How do we know the metrics we want to create will be of any value?
Let’s look at some general rules of thumb for metrics:
The fewer metrics, the better. Choose those that will have the most significant impact. When these have made good progress, consider changing the metric in an iterative way.
Minimize negative side effects. Remember that all metrics have strong side effects and that you must be aware of the potential negative impacts of optimizing for a specific goal. Select metrics that complement each other or counterbalance each other. For example, you can measure expense reduction to increase profit margins, but you run the risk of increasing time-to-market or decreasing quality.
People do what they are measured by. The underlying truth is that people’s behavior is always shaped by how they believe they are measured. Meaningful metrics result in meaningful and productive behavior changes. Meaningless metrics result in meaningless and counterproductive behavior changes.
Getting Useful Metrics, Removing the Rest
How can we tell useful metrics from damaging ones? In this section, we will look at whether your current metrics or any new ones you want to create are truly effective. We need to consider
What is your goal; what are you trying to achieve?
Is the metric valuable to the team that is using it?
How long will it remain valuable?
What is the cycle time for action based on this measure?
Are the candidate measures actually aligned with the business needs?
Do they build ownership with the team?
What is the true cost of collecting and analyzing the data?
What are the side effects of using this measure?
How could they be misused and damage our focus on value delivery?
Let’s look at each of these in more detail.
What Are We Trying to Achieve?
Before selecting any metrics, first consider what it is that you want to achieve, what decisions you need to make, and questions you need to answer. Use these goals to help you select the appropriate metrics. Never measure things just because you can.
This is a fundamental starting point. There are many metrics that we can collect and report but that are a distraction from where the focus should be. Many of these are labeled “vanity metrics,” a term coined by Eric Ries in his book The Lean Startup. Vanity metrics are metrics that are not actionable. As a simple rule of thumb, beware of vanity metrics.
Businesses need to focus. Large numbers of metrics cause loss of focus and paralysis among leaders. Start by deciding what your most important issues are, and then create metrics to help you improve in these areas. As each improves to an acceptable level, choose a new area for improvement and iterate.
Some years ago, a software development team was suffering from growing technical debt.3 As the debt grew, it became more difficult to assemble the product. Testing time was extended and quality still suffered. The leaders felt that business pressures to meet a commitment schedule were causing the teams to cut corners in unit testing. Because meeting the schedule was their primary measure of success, the quality was deteriorating as teams rushed through the unit testing process.
3. Technical debt is anything that makes it difficult for the team to add value. It includes untested code, unfixed errors, and open design issues.
To test their theory about the suspected cause and effect relationship, the leaders started measuring how much of the code submitted to the library was covered by automated testing scripts. They captured the data and, without announcing anything specific to the teams, started displaying the results on a wall that was visible to the entire team. Over the next three months—and without ever formally announcing an initiative to improve automated test coverage—unit test coverage increased dramatically and the problems with product integration evaporated.
It is interesting to recognize that the team members obviously knew that they were cutting corners but, until the metrics made their compromises visible, their motivation to deliver on schedule was stronger than their motivation to deliver validated quality. By creating and publishing the measure, the leaders clearly demonstrated that they valued validated quality and the team modified its behavior accordingly. The leaders wanted to achieve improved quality and they found a metric to hit that goal.
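The nightly bookkeeping behind such a wall display could be as simple as the sketch below. The data, the function name, and the weekly granularity are all hypothetical; the book does not say how the leaders produced their coverage numbers:

```python
def coverage_percentages(weekly_counts):
    """Turn (lines_covered_by_automated_tests, lines_submitted) pairs
    into a coverage percentage per period, rounded for display."""
    return [round(100.0 * covered / submitted, 1)
            for covered, submitted in weekly_counts]

# Hypothetical tallies over the three months described above:
weekly = [(400, 1000), (550, 1000), (720, 900), (880, 950)]
print(coverage_percentages(weekly))  # a rising trend the whole team can see
```

The point is not the tooling but the visibility: a single number, published on a regular cadence, was enough to change behavior without any formal initiative.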
Is the Metric Useful to the Team?
Metrics are most powerful when the teams choose those that will benefit them the most. (Imagine the trust that this requires from the teams’ leaders!) Ask the team what measures would best help them achieve their goals. This not only increases the team’s ownership of the process but also prevents the selection of inappropriate metrics. You can also ask whether any current metrics are getting in their way, acting as walls between them and their goals.
How Long Will the Metric Remain Useful?
Decide up front to avoid accumulating out-of-date metrics. Don’t keep using metrics that are no longer useful. Once a metric has served its purpose and you have achieved the goal, consider dropping it unless it is motivational over the long term. After we’ve solved the problem, we shouldn’t need to keep the measure until the end of time.
One team was having problems with code stability. During a retrospective they discovered no one was checking in code until the very end of the iteration, which did not allow for enough time to test. After some conversation, they decided to measure the number of check-ins per day. Over time, the team got into the habit of checking the code in several times a week. The code was then integrated and tested more frequently. As the code became more stable, they stopped measuring check-ins.
What Is the Cycle Time for Action?
Use the cycle time of the metrics to evaluate the effectiveness of the measure. The shorter the cycle the better. The sooner the team can get and incorporate feedback the better. Vanity metrics often have an infinite cycle time.
The check-in and code coverage measures described earlier are good illustrations of short cycle times. Both were available daily, allowing the teams to respond immediately.
As an example of long-cycle-time metrics, think of waterfall processes with their end-of-project data analysis and review. Only at the end of the project does the team gather and assess feedback—way too late to make any impact on the project. Problems and actions the team identifies can, at best, only be implemented in the next full development cycle. Contrast this with the automated test code coverage example earlier. In this case the action cycle time was a single day. Results were gathered and published every night. Improvements were immediately visible.
Are the Candidate Measures Actually Aligned with the Business Needs?
Are the metrics fully aligned with increasing business value and collaboration? It is essential to honestly assess whether improvements in the metric will actually lead to the delivery of increased business value.
An organization we worked with used a typical Stage-Gate process for product development. They assessed product quality just prior to product launch. The company found that it was having difficult discussions about how many of the known remaining defects they should let loose on their customers. Perhaps a valuable metric to measure their product development process would be how often they completed a cycle without any known defects. This would align with a goal of avoiding the delivery of any known defects to their customers (which never seemed to make the customers happy or satisfied with their products)—a much better measure of customer value.
In teams using iterative methods, a common predictive measure is the technical debt in each iteration. A team using iterative methods for the first time focused on keeping this measure as low as possible. Their goal was to complete as many iterations as possible without any serious open defects. As the project neared completion, the normally difficult questions about whether the product was good enough to launch went away, and the team dramatically reduced the number of defects occurring at customer sites.
Do the Measures Reinforce Ownership in the Team or Remove It?
This criterion is similar to the one for alignment. Do the metrics leave the team free to define and take the actions that they believe are most strongly aligned with the delivery of business value?
One company we worked with measured aggregate customer field problems from its shipped products. The numbers of problems each month rose and fell in a random way but the overall trend was generally favorable as the team improved its processes. Unfortunately, the executive responsible for the customer support team, in monthly review meetings, used the statistical noise in the data as a stick to berate the product teams—even though there were no sensible short-term actions the teams could take. The result was that everyone but the executive in charge felt these meetings were a complete waste of time. The product teams felt no ownership for the data or the actions forced upon them. In fact, the process devalued the focus on the customer experience.
In parallel with this useless activity (and unknown to the executive), the product teams spent time finding and getting to the root causes of the problems. They then attacked these causes and measured the results. This measure, in contrast to the meaningless one, was highly useful to the teams and reinforced their ownership of product quality and the customer experience. Focusing on it delivered an ongoing improvement trend that was largely ignored by the customer support executive.
What Is the True Cost of Collecting and Analyzing the Data?
Always ensure that the cost of collecting the metric is significantly less than the value that it can deliver. And we do mean significantly less.
Recognize that the cost of collecting data is not zero and can sometimes be very high, especially if it involves continuous action by the delivery team members. This cost is often ignored and can have a huge negative impact on team productivity.
Some time ago, I was asked by a senior executive to analyze the amount of code and architecture reuse across our organization. We had a fairly good idea of the amount of reuse but I estimated that it would take more than 20 person-years to just collect the provenance of each piece of code in our libraries! So what was I going to do? Consume 20+ person-years of resource to collect incredibly detailed reuse data? Rather than do that, I personally did some sample analysis and validated this with a selection of seasoned professionals. I then returned to the senior executive and gave my report based on this high-level approach.
“Well, how much reuse do we have?” the executive asked.
“Quite a bit,” was my answer. “But it varies greatly between different environments.”
“Hmmm,” was his reply, “that’s what I expected.”
This high-level estimate was sufficient for the executive’s purpose and didn’t require costly gathering of detailed data.
As the data collection can be expensive, so can the analysis. We once ran a survey that generated over 6,000 write-in comments. Imagine the time to read, categorize, and analyze this unstructured data.
Remember to especially look for direct impacts on the development team. One team we worked with was doing a detailed “causal” analysis and categorization of every defect that they fixed in development. This involved answering 20 or so difficult questions about the root cause and characteristics of the defect. (Is this your definition of useful?) To ensure that this was done, the problem management system would not allow the defect to be closed until all the questions had been answered. The goal of this exercise was to give insight into weaknesses in the product, but it was totally ineffective. The time and cost to the individual developers were so high that many admitted to randomly selecting causes and characteristics (gaming the metric). In fact, the items at the top of the list of causes were the ones most often chosen (it took less effort to select from the top than to scroll down the list). It soon became obvious that this was a meaningless and expensive metric and the team decided they could provide more useful insight much more easily with a brief discussion between the developers as part of their retrospectives.
What Are the Side Effects of Using This Measure?
Look out for negative or unexpected side effects. Consider not only the intended goals but also how teams and individuals are likely to respond in both the short and the long term.
Leaders often evaluate the effectiveness of testing by the number of defects found by a set of independent testers. In this situation, testers are actually rewarded when they are given poor-quality product to test. They can then generate a large number of “found” defects that have the same underlying cause. By finding a large number of defects, they prove their value!
And why do the leaders use this metric? In theory, to motivate the testers to do more effective testing, which should lead to higher quality. And what is the result in real life? Measuring test teams this way is guaranteed to generate a highly confrontational relationship with the development team. After all, the development team is measured on zero defects. Thus begins the game. Was what the tester just classified as a defect really a defect or an artifact from an earlier version? What are the results of this confrontational relationship—confrontation driven by the side effects of metrics? Poor overall productivity, very low levels of morale, lower quality, and poor business value delivered.
In a real case, the leaders recognized this negative spiral and completely changed their approach. They abandoned this measure and brought the development and test teams together to enable effective collaboration to deliver quality. In this new arrangement, the testers sit next to and work with individual developers—even during unit test phases—and communicate face-to-face. The new metrics measure the growth in testing successes rather than defects found. Morale is higher and the whole team is working together to deliver increased business value.
Carefully consider the scope of a metric to ensure that it does not create suboptimal behavior.
How Could Metrics Be Misused and Damage Our Focus on Value Delivery?
With all metrics there is a potential for misuse. Sadly, management often pressures individuals to “enhance” data if they think the managers will be judged by the data. Deal with this up front, ideally by collecting data automatically from operational systems.
Many teams use velocity measures to assess and project progress. We recently worked with a team that was resizing their delivered stories after each iteration. We pointed out that this required significant work that did not add any value. After all, because they were adjusting both the delivered value and goal at each iteration, how could they assess their progress and improvement? In this situation, actual unadjusted velocity was a much better measure of their progress. The team agreed with us but told us that their management was misusing the measurement by rewarding or punishing them based on delivered velocity. To meet their velocity target, the team resized everything once they were done with their work. That way, their delivery velocity was always exactly on target—making the whole measure pointless.
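To see why after-the-fact resizing makes the measure pointless, compare the two calculations side by side. The story points below are hypothetical, purely to illustrate the gaming described above.

```python
def unadjusted_velocity(stories):
    """Sum the original estimates of stories completed in an iteration."""
    return sum(original for original, _ in stories)

def resized_velocity(stories):
    """Velocity after after-the-fact resizing: it always lands on target,
    so it carries no information about actual progress."""
    return sum(resized for _, resized in stories)

# Hypothetical iteration with a 20-point target; each story is
# (original_estimate, resized_estimate)
iteration = [(8, 10), (5, 6), (3, 4)]
actual = unadjusted_velocity(iteration)   # real signal of progress
gamed = resized_velocity(iteration)       # always exactly the target
```

The unadjusted figure can move up or down between iterations and so can show improvement; the resized figure, by construction, never deviates from the target.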
Take a look at all the metrics in your organization that affect your team. Are they providing value or just getting in the way? Are there any metrics that would be useful to your team? If so, do they pass all the criteria described in the preceding sections?
The most damaging effect of metrics that add no value is that the team feels they are not trusted (they must be monitored with useless measures and waste time generating the data to show they are “effective”) and that they don’t have ownership. Metrics, unfortunately, are easy to get wrong. And that comes with a high price tag for your organization, your teams, and your customers.
Please read the following at your peril. Many organizations feel that processes and metrics should be controlled and optimized by a central group. Such an approach takes ownership of process effectiveness, and of the metrics that enable it, away from the delivery teams.
We do not like “metrics programs” and do not recommend them to anyone. Metrics programs can be a significant wall to trust and ownership.
But if you are being forced to implement a metrics program, here are some things to avoid. If you test the concepts of a metrics program against the criteria you have read in this chapter, you will see that such programs fail the “meaningful metrics” test in various ways.
Why Do Metrics Programs Fail?
Many metrics programs fail and many more do not achieve their intended goals. There are many reasons for this but some of the most common are
Unclear goals and vision of what the organization is trying to achieve.
Misuse of metrics for competitive comparisons or as a punishment rather than a learning process.
Using metrics for political or other purposes rather than using them to help people to do their jobs more effectively and to deliver more value.
Poor communication of the goals and intentions of the program to those involved.
Focusing on process and not results.
Too many metrics leading to lack of focus and ineffectiveness.
Measuring vanity metrics.
Measurements that are not timely, or action cycle times that are too long.
Poor leadership, which is focused on the metrics process rather than its goals and effect.
Not adapting to the changing needs of the business. Metrics will change over time and you need a mechanism to ensure they remain relevant and effective.
If you must have an organizational metrics program, then focus on using it to help individuals and teams learn how to take ownership of their own processes and the metrics for improving them. Focus on results-oriented rather than in-process metrics and work to help, support, and shield your teams.
We are often asked for a step-by-step guide, so here is a simple process.
1. Define the goals of your metrics.
Ensure the goals are fully aligned with the business.
Focus on learning rather than politics.
Enable action at the lowest level possible.
2. Involve the key players in deciding which specific metrics are going to be the most effective. Make sure you include those who are expected to take action.
3. Select the metrics you will use, and decide how you will use them, by applying the metric selection process described earlier.
4. Communicate the metrics and goals of the metrics to all the teams. Explain why the metrics were chosen and how they will be used.
5. Regularly review the effectiveness of the metrics and be ready to change them as your needs change.
As we work to improve the trust and ownership in our organization, we need clear indicators of progress so that we can focus our activities for best results.
The most valuable metric you can use is the team’s view of the progress you are making.
Are you, as a leader, effectively supporting and enabling the move to improved trust and ownership?
Does the team feel trusted enough to take effective action without asking for permission or approval?
Is the team able and willing to take ownership of their deliveries?
Is the team purpose aligned with the business goals?
Does the organization deal honestly with ambiguity?
Using these metrics will enable you to focus on the most effective actions to take to move forward.
Using metrics effectively is much more difficult than it appears to be. Badly chosen metrics can mislead and drive counterproductive behaviors. However, well-chosen metrics can be a powerful enabler and motivator of effective action.
 Reichheld, Frederick F. “The One Number You Need to Grow.” Harvard Business Review. December 2003.
 Ries, Eric. The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses. Crown Business, 2011.