Correlation Does Not Imply Causation

As we prepare for a new year, and as I get ready for a metrics panel at the Spring 2012 Expo, I have been pulling together a series of thoughts on the metrics and measures that are important to talent acquisition.

For the past several months, my team has reviewed dozens of articles, blogs, and white papers that outline foundational and basic aspects of “How to do Metrics.” There is a tremendous amount of material available simply by using search engines to find information on metrics.

I am encouraged by the amount of content dedicated to subjects such as what metrics can be tracked, the quality-of-hire conversation, the candidate experience, and how metrics can serve as a stepping stone to a real relationship with business leaders. I will also admit that the meat behind many of these blogs, articles, and white papers is pretty lean, but there are exceptions. Shout out to Chris Brabic at Smashfly for his tutorials that get into some of the detail.

As I prepare for the metrics panel at the spring ERE conference, it occurred to me that statistics and analysis tend not to be standard training for recruiters. Some recruiters were engineers, programmers, or MBAs, and as such have some basic to intermediate statistics training. But that training is likely reinforced by using Excel with tables, pie charts, and graphs, not by the actual definitions, architecture, and structure of true statistical analysis.

Which brings me to this post, and the danger of correlation and causation. It is not new to hear that metrics, when pulled together and compared to each other, tell a story. Much of that story has to do with correlation. As an example, if you spend more money (increase cost per hire), you may reduce your time to fill. Well, sometimes that is true. Sometimes.

That relationship may not be a causal relationship: One does not necessarily cause the other. The dependence that we wish was there is actually not there in the strength that we need it to be, or even at all. There is a common scientific and statistical concept that states “correlation does not imply causation.” I find that to be very true in recruiting and talent acquisition metrics.
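To make the point concrete, here is a quick, throwaway sketch (all numbers are invented, and the "hiring volume" confounder is my own illustration, not from any real data set): two metrics can correlate strongly simply because a third factor drives both of them.

```python
import random

random.seed(42)

# Hypothetical illustration: hiring volume (a confounder) drives both
# cost per hire and time to fill, so the two correlate strongly even
# though neither causes the other. Every number here is made up.
n = 200
volume = [random.gauss(100, 20) for _ in range(n)]
cost_per_hire = [3000 + 15 * v + random.gauss(0, 200) for v in volume]
time_to_fill = [20 + 0.2 * v + random.gauss(0, 2) for v in volume]

def pearson(x, y):
    """Plain Pearson correlation coefficient, no libraries needed."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Strong positive correlation, yet raising spend would not move time
# to fill here: only changing hiring volume would.
r = pearson(cost_per_hire, time_to_fill)
print(f"correlation between cost per hire and time to fill: {r:.2f}")
```

The dashboard would show cost per hire and time to fill marching in lockstep, but intervening on one would do nothing to the other.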


We try so hard to find how one metric impacts another. Technologies, branding companies, consultants, and so on use metrics to drive home value, and they should. We all try hard because we really want to sort out why things are happening and what we can do to change them, and that is a worthy endeavor.

However, I caution against correlating metrics in order to force causation. It is more likely that two or more metrics simply correlate than that a true causal relationship exists between them.

As you review your metrics and measures for 2012, I encourage you to:

  1. State which metrics you are correlating together, and challenge yourself to see if you are hoping for a causal relationship, or if a causal relationship actually exists.
  2. Prove that the causal relationship has validity and can be repeated time and time again.
  3. Go back to your executive presentations and record where you indicated that correlations and causal relationships exist. Remember that those statements are now out there, and your audience may now expect those causal relationships to hold.
  4. As you create or refine goals for your recruiting teams or the hiring managers, be aware of these causal and non-causal correlations, as it will help you declare and meet expectations in the marketplace.
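Point 2 above, proving repeatability, can be sketched in a few lines: split your historical data into periods and check whether the relationship keeps the same sign and similar strength in every one. The data, the period structure, and the -0.3 threshold below are all invented for illustration.

```python
import random

random.seed(1)

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Four quarters of hypothetical (cost per hire, time to fill) pairs,
# generated so that higher spend genuinely shortens time to fill.
quarters = []
for _ in range(4):
    cost = [random.gauss(4000, 500) for _ in range(60)]
    ttf = [45 - 0.005 * c + random.gauss(0, 2) for c in cost]
    quarters.append((cost, ttf))

# A relationship worth presenting to executives should recur in every
# period, not just in one lucky quarter.
rs = [pearson(c, t) for c, t in quarters]
print("per-quarter correlations:", [round(r, 2) for r in rs])
repeatable = all(r < -0.3 for r in rs)
print("repeatable:", repeatable)
```

If one quarter flips sign or collapses toward zero, that is exactly the signal to stop claiming causation until you understand why.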

Happy metric-ing, and see you at the Spring ERE!

Andrew Gadomski is the founder of Aspen Advisors. Aspen is an efficiency consulting firm that services HR and talent functions at companies with globally and/or socially responsible values. In addition to leading transformative efforts at Aspen’s clients, he is also on the faculty at New York University on the use of online and social networking tools for career management and advancement.


25 Comments on “Correlation Does Not Imply Causation”

  1. Great article. My view is that this is a “201” subject (or maybe even “301”), whereas most recruiting organizations struggle to get core operational data and metrics in place (the “101,” or first-order hierarchy of key operational metrics).

  2. I enjoy when these discussions pop up every so often, as there is an ongoing and constant battle to explain causation. This is a great article, and if it moves just a few more people and organizations to understanding, then I’ll be a happy camper!

  3. Great thoughts here.

    One thing I’ve been mulling over recently is the fact that various candidate attributes can be equally correlated with success, but that some command a much higher price than others. College education, for example, might be a sufficient condition, not a necessary one, for long-term success.

    Could you justify hiring a non-college grad with excellent structured thinking skills and quantitative ability for a general analyst position? They’d certainly be cheaper to hire on the open market. In this case, is a college degree a real job necessity, or just a CYA by the hiring manager?

  4. Geordie… a problem with this whole discussion is that one requires really large sample sizes to understand whether there might be meaning, in aggregate, from correlation and causation.

    On the margin, even with substantial probabilities, any model won’t be very predictive. What I mean is that for any given hire, or small number of hires, of course one could justify a non-college grad hire. There are certainly some non-college grads that will outperform college grads. It’s only if we study large groups of data that we can find whether a collection of hires of non-college grads is less effective than a similar collection of college grads.

    This is not a knock on Adam’s article… but one problem with these discussions is that for small sample sizes (less than many, many hires) the models are not that useful.

    Metaphorically, it’s a little like playing craps or rolling two dice. The probabilities become true after many, many rolls, and only after many rolls of the dice do the results begin to resemble the probability distribution curve. But on any one roll, or even a good quantity of rolls, the values can be far outside the probable distribution. So one might roll the dice 25 times and not roll a “7,” even though the probabilities suggest that this is unlikely.

    Underpinning this is really a discussion on standard deviation.

    But like the dice-rolling example, it’s only if you roll the dice a large number of times that the patterns become clear and move toward the probabilities.

    The same is true with hiring volumes. You could hire 15 non-college grads who ALL outperform another collection of college grads. It’s only if you hire much larger quantities that it becomes meaningful.

    And that doesn’t even get us started on Human Performance Technology and the drivers of success in role. Hire those same superstar non-college grads and put them in the wrong environment or with a poor people manager, and the probabilities become even more meaningless.
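The dice metaphor in the comment above is easy to verify with a quick simulation (a throwaway sketch, not part of the original discussion): a small sample of rolls can look nothing like the true distribution, while a large one settles right onto it.

```python
import random
from collections import Counter

random.seed(7)

def roll_two_dice(n):
    """Sum of two fair six-sided dice, rolled n times."""
    return [random.randint(1, 6) + random.randint(1, 6) for _ in range(n)]

# Small sample: 25 rolls. The expected count of sevens is about 4.2,
# but any individual run can land far from that, including zero.
small = Counter(roll_two_dice(25))
print("sevens in 25 rolls:", small[7])

# Large sample: 100,000 rolls. The observed frequency of a seven
# converges toward the true probability, P(7) = 6/36 ≈ 0.167.
big = Counter(roll_two_dice(100_000))
print("observed P(7) over 100,000 rolls:", big[7] / 100_000)
```

The same convergence logic is why conclusions drawn from a handful of hires are so fragile: the "distribution" only shows itself at volume.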

  5. Jason,

    I absolutely agree. However, I suspect firms can benefit substantially (on aggregate, in the long run) by identifying undervalued candidate attributes — i.e. the “Moneyball” approach. Please accept my apologies if you know the whole story, but here’s a summary:

    One of the baseball sabermetricians’ breakthroughs was to understand the WORP (wins above replacement player) value of a prospect as a function of various statistics. Before the rest of the league caught up, Billy Beane was “buying” wins at a much lower price, because he would target recruits more intelligently than the competition.

    Take a .300 hitter versus a .250 hitter. What if the weaker batter forces opposing pitchers to throw twice as many balls per at-bat? The extra wear on the pitcher will result in more wins for the team — maybe even enough to outweigh the batting average difference. But the market doesn’t care about pitches-per-at-bat, so you can pick up these guys for far less money per WORP.

    I don’t mean to limit this to college degrees — it’s just a convenient shorthand for “things everyone thinks are great”. I’m simply wondering aloud if there are measurable “things nobody knows are great” that could allow one to widen the base of acceptable candidates.

  6. Thanks, Andrew. “Correlation Does Not Imply Causation” may not be true in the real world, but it certainly is in the world of recruiting. Not only is this true in recruiting but also:
    1) “The plural of anecdote is data” and
    2) “Opinion from a ‘thought leader’ equals fact.”

    Can you imagine how many sr. executives (both in Corporate Leadership and Staffing) would be embarrassed by having to now base their recruiting on what might be called fact- and results-based “Generally Accepted Recruiting Practices” or GARP? Could you conceive some other area of the company saying: “No, we’re not going to use what has been shown to work or not to work; we’re going to do Accounting the way the founders want us to”, or “Our Legal Department is not going to follow working legal practices; we’re going to do what the speakers at the legal convention (who used to be lawyers but haven’t practiced law for years and now spend their time as non-attorney legal advisors) tell us to do based on their own opinions instead of updated law?” Well, that’s what we have in Recruiting.


    Keith “Let’s Work on Developing Generally Accepted Recruiting Practices” Halperin

  7. @ Steven – thanks! Hope to see you in San Diego. If you want to contribute prior somehow, let me know!

    @ Jim – thanks as well. I am in for that camp out too!

    @ Jason – you always call me Adam – don’t sweat it… happens all the time 🙂 I agree that the sample size is key to avoiding a strong +/- percentage on your forecast. You and I have the luxury of seeing dozens of companies’ data each year and peeling it back, many of those companies hiring in the 10,000+ range annually. Most companies, because of their sheer volume, can have wide variance. That being said, it’s even MORE important to take caution on this topic. A slight indication of anything is more likely to mean nothing when the sample size is small.

    @ Geordie – I am a HUGE baseball fan, and love the stats. Not surprising given my love of data. In the Moneyball fashion, I do think there are key stats, but they are not as general as we would like, and probably can’t be used across the board. For example, in sales: find the on-base percentage (OBP) and # of runs scored, i.e., how often salespeople turn demos into proposals (or similar) and how many deals they land. I find it odd how those types of questions don’t come up as much as, say, “where did you go to school.” Different jobs have different Moneyball stats (like the famous walks or OBP), so you can challenge yourself to find them for each position, but just like Billy Beane, you will find resistance to your logic from stakeholders who are used to the status quo. PS – if you like Moneyball, make sure you read Outliers.

    @ Keith – Amen. Love the idea of GARP. SHRM is actually progressing this nicely (see the Cost per Hire standard and others), and the pipeline for those metrics is wide and deep. Advocacy and adoption of such standards will greatly improve the status quo. I am involved in that movement and welcome some help, so give me a call and we can GARP together.


  8. Yeah, that’s it Andrew.

    Keith, on the other hand, I never get confused about. Probably because I always think “Keith Richards” when I see his posts.

    Again, not sure why.

  9. @ Jason: Maybe because I’m the opposite of Mr. Richards’ “coolness”.

    Happy Friday,
    Keith “Accept No Substitutes” Halperin

  10. Great conversation, and as a certified Six Sigma Black Belt lurking in the recruiting world, I love to see statistics effectively applied.

    To add my two pence worth: the reason we collect and report metrics is to drive change. The number one question to ask when deciding on a metrics package is “what in my organisation do I want to change?” Where is the pain? I have always found that when I work backwards from that question, the metrics created are naturally causal.

    Another basic, in my opinion: the metrics data capture has to be simple. If you aren’t already capturing it in your organisation, the information couldn’t have been that critical. Big flashing yellow lights — if you have to create an Excel spreadsheet to capture the data to report the metric, you’ve got the wrong metric!

    Finally, start with these four categories: People, Quality, Velocity, and Cost. If your metric doesn’t fit into one of these categories, it probably isn’t very meaningful.

    Happy Reporting!!!

  11. Very well said. I am a strong proponent of metrics as long as I am not the person compiling and calculating them. My rule of thumb is: no-source, through-source, or out-source these types of things if they consume more than 5% of your time….

    Also IMHO, instead of driving change, metrics are frequently used as a means to prevent change and to reflect the aims and goals of the people requesting the metrics. Also, metrics which might show a radical divergence from the accepted status quo tend to be either not requested in the first place, discounted, or ignored. (EXAMPLE: The recent financial crisis.)



  12. @ Lisa re: driving change – this certainly is an excellent platform for how to engage metrics. I agree that you can work backwards to where is the pain.

    @ Lisa re: categories – I like those. I might steal those 😉

    @ Keith re: preventing change and reflecting goals – ANOTHER good platform, and I am glad you offered a counterpoint to Lisa.

    A strong metrics platform is built on several pillars, not just one. Tracking and engaging metrics should be set up to be painless, but the analysis should be taxing. Track against initiatives, desired change, desired status quo, performance, quality, excellence, consistency, and so on. If you set up a metrics platform where the data flows easily, and desired target ranges are occurring, then the dashboard is green. However, when a metric falls outside the targeted range, the more data you have that could correlate, and potentially yield causation, the better.

    Done correctly, a metrics program is eventually built wide and deep (start small, of course), and in many respects it may eventually seem like overkill. But if it’s painless to collect and track, then it’s just about up-front investment. It pays off when something “is wrong.” If you have a metrics platform and still can’t figure out what may be occurring, or at least don’t have leading indicators backed up by data on what may be occurring, then it could be that the platform that was created is too narrow (or there is a lack of analysis skills).

  13. Thanks, Andrew. When you said: “If you have a metrics platform, and still can’t figure out what may be occurring” do you mean that (while statistically significant) the results aren’t clearly open to interpretation (which seems like a failure in upfront planning: “What/why do we want to measure something, and what does it mean if we do?”), or that the results aren’t statistically significant?


  14. I meant it to be simpler than that, Keith – sorry, I did not clarify. I mean, if you measure, say, 5 things, and you are hitting all targets but something is still off, perhaps you should start measuring other pieces of data to help identify the issue. That’s what I meant by too narrow a platform.

  15. @Geordie:

    We actually looked at this when I was at Google: what are the potential pre-employment predictors of future success at Google? We looked at a huge number of potential independent variables and a good collection of potential dependent variables. We then did a big multivariate regression analysis to uncover any correlates.

    The Moneyball analogy is reasonable but suboptimal, mostly because success in baseball can be distilled into key performance indicators (like the famous “on-base percentage” that Billy Beane correctly interpreted as being superior to slugging percentage or even batting average). The issue is that on-the-job performance, outside of the baseball diamond, is much more difficult to quantify into discrete measures.

    This type of thinking and statistical modeling could (and should) be applied to job segments with clear distinct performance measures / success criteria, but even this is dicey, because of the environmental variables. The classic example is sales, which is easy to measure, but how do you account for the guy who gets the tough (or easy) territory and therefore performs much differently than the mean?

    So this becomes a theory / practical discussion really quickly.
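The multivariate regression the comment describes can be sketched in a few lines. To be clear, everything below is invented for illustration: the attribute names, the effect sizes, and the data have nothing to do with Google's actual study. The sketch regresses a performance score on several pre-hire attributes at once and recovers which ones carry signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-hire attributes (independent variables) for 500
# hires. Names and distributions are made up for the sketch.
n = 500
interview_score = rng.normal(3.0, 0.5, n)
years_experience = rng.normal(5.0, 2.0, n)
has_degree = rng.integers(0, 2, n).astype(float)

# Suppose, in this toy world, only interview score truly drives the
# later performance rating (dependent variable).
performance = 1.5 * interview_score + rng.normal(0, 0.5, n)

# Multivariate ordinary least squares: stack all predictors plus an
# intercept column, then solve for the coefficients jointly.
X = np.column_stack(
    [interview_score, years_experience, has_degree, np.ones(n)]
)
coefs, *_ = np.linalg.lstsq(X, performance, rcond=None)
for name, c in zip(["interview", "experience", "degree", "intercept"], coefs):
    print(f"{name:>10}: {c:+.3f}")
```

The regression attributes the outcome to interview score and gives the decoy predictors coefficients near zero, which is exactly the "uncovering correlates" step; the hard part in practice, as the comment says, is getting a dependent variable that actually measures on-the-job performance.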

    @Andrew (aka Adam Levine): I disagree that the analysis of metrics should be taxing. If you have the right metrics defined (the actionable business intelligence), it should not be taxing. I am also not sure that wide and deep is the de facto answer: it’s really a marginal cost, marginal benefit equation. Few corporate departments run so efficiently that the corner-case, abstruse metric allows them to weed out the waste to the nth degree. I see many, many organizations chase the corner cases while the core metrics are left unfiltered, undiagnosed, or uninterpreted. So wide and deep creates a signal-to-noise-ratio issue that most corporate HR departments don’t cope well with.

    Actionable business intelligence is the key, and by definition, is a measurement of the things that matter.

    Good discussion.


  16. Jason, it’s true that too wide and deep is a distraction for many. The analysis should be taxing, as in that is where you put in the time, versus collecting the data. Pull and view data in seconds, analyze for minutes, optimally. Plenty of cases where it takes people hours to collate data – too many 🙂

    Wide and deep is, of course, relative. If your recruiting metrics program is twice as robust as the sales team’s metrics, there may be some perception problems with leadership. My take is to be appropriate (whatever that is) plus a little more. It should feel right internally and, as you point out, hit the business intelligence that is actionable. When you need to dig deeper, then do that.

  17. @ Andrew: Thanks. We should talk some more offline.
    @ Jason: How did you define “success” there?

