An engineer’s take on improving productivity of knowledge workers: Part 3 (Math)
In the final part of this series, I am going to focus on the precision of insights. Math is our best friend when it comes to precision, but it is really important to first define the problem space well and structure the solution space before we start seeking precision (refer to part 1 and part 2 for context).
Let’s continue our discussion (from part 2) of the situation where a company needs to increase the percentage of high performers in order to meet its business goals. We leveraged the Cartesian method to break the problem into its underlying components and then used systems thinking to understand the order in which those components should be optimized. The next step is to figure out which actions have the highest leverage.
We can start by identifying a group of existing associates who represent the type of talent we want more of. In a typical company, the next step might be to look at which schools or companies these associates came from, and to conclude that we should hire more from those schools and companies. This is an example of univariate analysis (looking at one variable at a time). The two main issues with univariate analysis are misleading insights and lack of conviction.
Misleading insights: You may see a difference in performance across one variable when the difference is actually driven by another underlying variable. It is also possible that the effect of an important variable is masked by variation caused by other variables that impact the outcome. The charts in Figure 2 show the impact of a manager’s level on the performance of early-career associates. If we only looked at the univariate analysis (left chart) we might conclude that the level of the manager doesn’t matter. But once we control for confounding factors like associate tenure, prior ratings, performance in the selection process, associate engagement and more, we realize that the level of the manager matters a lot.
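A tiny worked illustration of this masking effect, using entirely invented counts (the variable names and numbers below are hypothetical, not the data behind Figure 2): in the aggregate view, associates under junior managers actually look *better*, but once we stratify by tenure, senior managers show a clear lift in every band. The aggregate view is misleading because senior managers happen to be assigned more early-tenure associates.

```python
# Hypothetical (high_performers, total) counts per (manager_level, tenure_band).
# Constructed so that the tenure mix masks -- here, even reverses -- the
# manager-level effect in the univariate view. All numbers are invented.
counts = {
    ("senior", "early"):   (30, 150),
    ("senior", "tenured"): (30, 50),
    ("junior", "early"):   (5, 50),
    ("junior", "tenured"): (75, 150),
}

def rate(level, band=None):
    """High-performer rate for a manager level, optionally within one tenure band."""
    cells = [v for k, v in counts.items()
             if k[0] == level and (band is None or k[1] == band)]
    hp = sum(c[0] for c in cells)
    total = sum(c[1] for c in cells)
    return hp / total

# Univariate (aggregate) view: junior-managed associates look better...
print(rate("senior"), rate("junior"))                        # 0.30 vs 0.40
# ...but within each tenure band, senior managers show a clear lift.
print(rate("senior", "early"), rate("junior", "early"))      # 0.20 vs 0.10
print(rate("senior", "tenured"), rate("junior", "tenured"))  # 0.60 vs 0.50
```

This is the classic Simpson’s-paradox pattern: controlling for (or stratifying by) the confounder flips the conclusion.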
Lack of conviction: The big difference between hypotheses, best practices and insights is the level of conviction you have (or should have) about taking action based on them. In the people space, almost everyone has strong opinions about what makes individuals and teams more productive. Some believe the greatest leverage is in an inspirational manager; others believe it is in learning and development, psychological safety, and so on. Leaders do not (or should not) feel confident enough to make big bets based on such hypotheses or best practices. Hypotheses usually come from personal experience, which raises a question about broader applicability. Best practices come from academic research or case studies from other companies, which raises a question about their applicability to your company. For example, Google published an excellent study called Project Aristotle in which they found that the #1 driver of team effectiveness at Google was psychological safety. Improving psychological safety is a no-regrets move, but it may not be the highest-ROI investment. It is possible that psychological safety is the #6 or #10 most important driver of team effectiveness at your company. As an example, take one variable that may differ at Google compared to a typical company: given Google’s ability to attract top talent, the variation in talent / skill levels there may be relatively low, which would cause that variable not to show up as an important driver. But at a typical company that variation can be very large, and it is possible that the skill level of team members is the top driver there. Assuming limited time and money, that company may be better served by investing in hiring and upskilling talent than in improving psychological safety. So it is understandable that the conviction to invest based on best practices can be low.
The conviction drops even further if the idea seems counterintuitive. Ironically, these are exactly the kinds of insights that can help a company stand out from its competition.
We evaluated a very expensive people-leadership program designed to help employees with elite subject-matter expertise become better people leaders. The program had glowing NPS year over year. What if I told you that the target employees for this program should not be those who need to improve their leadership, but rather associates who are already great people leaders? I don’t think most decision makers would go for that recommendation.
But when we compared the leadership metrics (survey scores, attrition levels on the team, etc.) of participants to those of other associates who did not attend the training but were similar along other dimensions like tenure, department, team size and co-location with the team, we realized that the program had little to no impact on improving leadership metrics. It did, however, reduce the participants’ chances of attrition by 75%. The net effect of the program was to keep associates with leadership gaps at the company longer, and in important leadership positions. The conclusion could be that this is an expensive and ineffective program which should be shut down. But the program could be better utilized as a lever to reduce attrition of the great leaders you want to keep at your company; in that context, it is a highly effective and inexpensive program. Current participants, on the other hand, would be better served by a different program with a more positive impact on developing leadership skills. To make such decisions we need to show what works and doesn’t work in the specific situation in question, and to have high certainty in the insight.
Approach to arrive at the right insights with conviction:
Multivariate analysis is a better way to achieve precise insights. Even a simple regression model can go a long way towards improving precision and level of conviction about insights. Figure 3 is an illustrative example of all the different types of variables that may affect performance of an employee. Any information available along these dimensions should be used to build a model which aims to differentiate the high performers from the overall employee population.
Benefits: There are at least three types of insights you can arrive at through multivariate analysis
- How well can you predict the outcome?: Based on all the information available to us, we can understand how well we can predict the outcome. Metrics like accuracy, precision and recall quantify the level of certainty, which in turn should influence the size of the bets we make.
- What matters and what doesn’t?: The model can tell you the relative importance of different variables in predicting the outcome you care about. It can also tell you which variables have little to no impact on that outcome. A univariate analysis may have indicated that candidates from certain schools / companies tend to perform well at your company. But a multivariate analysis may show that once you account for candidates’ performance in the interview process and certain characteristics of their first manager, where they came from doesn’t matter. Such a conclusion may entirely change your strategy: from targeting certain schools / companies at the sourcing stage to welcoming applications from a wide range of applicants while tightly managing a high bar during selection, matching new hires with strong managers, and upskilling other managers on the dimensions important to success. This approach can avoid the trap of the team taking action based on the hypothesis of the loudest or most senior voice in the room. When we have a new hypothesis, we can add the new variable to the model and check whether it turns out to be significant after the other variables in the model have been accounted for.
- How much one variable matters when you hold other variables constant: Burnout is a big challenge everywhere these days. The best way to fight burnout is figuring out where the relative leverage is. Usually 80% of the value can be obtained with 20% of the effort, but the question is which 20%. A model can tell you how much one variable impacts the outcome relative to another. We can use this information to decide how much effort / investment to allocate to a particular initiative.
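On the first benefit, the metrics named above (accuracy, precision, recall and their harmonic mean, F1) all come straight from a confusion matrix. A minimal sketch with invented counts for a "predict high performer" model:

```python
# Toy confusion-matrix counts for a "predict high performer" model (invented).
tp, fp, fn, tn = 40, 10, 20, 130

accuracy  = (tp + tn) / (tp + fp + fn + tn)  # overall hit rate
precision = tp / (tp + fp)                   # of predicted high performers, how many truly are
recall    = tp / (tp + fn)                   # of true high performers, how many we caught
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, precision, recall, f1)  # 0.85, 0.80, ~0.67, ~0.73
```

A model with high precision but mediocre recall (as here) supports confident bets on the people it flags, but not sweeping claims about everyone it misses; that distinction should shape the size of the bet.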
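The second benefit, a school effect vanishing once interview performance and first manager are accounted for, can be sketched with a toy regression. Everything below is invented and deliberately constructed so that performance depends only on interview score and manager quality, while "top school" merely correlates with interview score (this assumes NumPy is available):

```python
import numpy as np

# Invented toy data: performance is driven entirely by interview score and
# first-manager quality; "top school" is only correlated with interview score.
interview = np.arange(1.0, 11.0)                 # interview score, 1..10
top_school = (interview >= 6).astype(float)      # stronger candidates came from "top schools"
manager = np.array([0.0, 1.0] * 5)               # 1 = strong first manager
performance = 2.0 * interview + 3.0 * manager    # true drivers: no school term

# Univariate view: top-school hires clearly outperform on average.
univariate_gap = (performance[top_school == 1].mean()
                  - performance[top_school == 0].mean())

# Multivariate view: regress on intercept + all three variables at once.
X = np.column_stack([np.ones_like(interview), interview, manager, top_school])
coefs, *_ = np.linalg.lstsq(X, performance, rcond=None)
print(univariate_gap)  # large gap in the raw averages
print(coefs)           # ~[0, 2, 3, 0]: school carries no additional signal
```

The univariate gap is real but not causal: once the variables that actually drive performance are in the model, the school coefficient collapses to zero, which is exactly the strategy-changing conclusion described above.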
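The third benefit, comparing the leverage of one variable against another while holding the rest constant, amounts to comparing standardized effects: coefficient times the variable's standard deviation. A minimal sketch for a hypothetical burnout model (every variable name and number below is invented for illustration):

```python
# Hypothetical fitted coefficients and observed standard deviations for a
# burnout-score model. All names and numbers are invented.
coef = {"weekly_hours": 0.06, "meeting_load": 0.10, "manager_support": -0.50}
std  = {"weekly_hours": 8.0,  "meeting_load": 4.0,  "manager_support": 0.7}

# Expected change in burnout from a one-standard-deviation move in each
# variable, holding the others constant (coefficient * SD):
impact = {v: coef[v] * std[v] for v in coef}
top_lever = max(impact, key=lambda v: abs(impact[v]))
print(impact, top_lever)  # weekly_hours has the largest standardized effect
```

Note that the variable with the biggest raw coefficient (manager_support here) is not automatically the biggest lever; what matters is how much realistic movement in the variable moves the outcome, which is what guides the "which 20%" decision.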
Challenges: There are two common challenges to taking a multivariate approach
- Data issues: We may find that we lack information on some of these variables, or that the information is not captured in a systematic / reliable fashion. This can be a good opportunity to start collecting the data for the variables in question. Taking a structured approach to problem solving, as we have discussed, makes it more likely that we will identify additional data that needs to be captured, versus starting with the data we have and looking for answers in it. Even for the data we do collect, a significant time investment is needed to make sure the data is ready for model building. As a rule of thumb, 80%+ of the time in the model-building process can go toward data prep. It may not feel like glamorous work, but it is critical for getting grounded insights from this approach.
- Sample size: As the complexity of the problem increases, the required sample size increases as well. A startup or small company will not have enough data to build complex models specifically tuned to that company. In these cases we can rely more on structured problem solving, coupled with judgement and research, to get to a good solution. Another solution is creating syndicates, i.e. groups of companies that come together to share anonymized data with a third party, in the belief that the aggregate information and the insights based on it will be helpful for everyone.
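To make the sample-size constraint concrete: one rough heuristic sometimes cited for logistic-style classification models is to plan for roughly 10–20 outcome "events" (here, high performers) per predictor variable. The function and numbers below are illustrative only, not a substitute for a proper power analysis:

```python
# Rough "events per variable" heuristic: ~10-20 outcome events per predictor.
# Illustrative only; real model planning should use a proper power analysis.
def max_predictors(n_employees, high_performer_rate, events_per_variable=10):
    events = round(n_employees * high_performer_rate)  # expected high performers
    return events // events_per_variable

# A 200-person startup with ~15% high performers supports only a handful
# of predictors, while a 20,000-person company supports a far richer model.
print(max_predictors(200, 0.15))     # 3
print(max_predictors(20_000, 0.15))  # 300
```

This is why the small-company options above (judgement, research, syndicated data) matter: with ~30 high performers in total, a model with dozens of variables will mostly be fitting noise.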
Risks of the multivariate approach: We should be mindful of some risks, even when we have sufficient sample size and good-quality data
- Overconfidence: These days it is easy to build a model in Python that fits your data. It is important to have the right level of confidence in the model, and metrics like accuracy, precision, recall and F1 score can help us calibrate that confidence. But the mathematical representation can fill us with a false sense of confidence, especially if the model confirms our hypothesis. It is important to follow good model-building practices and to understand the difference between correlation and causation.
- Failing to incorporate intuition and judgement: In general, it is important to bring in an understanding of how the world works (psychology, economics, etc.) to complement the math in arriving at the proposed solution. This is especially important in the people space, where harmony between logic and intuition is key to success.
As I mentioned at the beginning, I don’t have a theory of everything as it relates to the productivity problem, but I wanted to share the frameworks and tools I am using to tackle it. I hope to hear from you if you see ways to improve the approach, believe it is entirely wrong, or have a different way to go about it.