Screening Tests: What Can They Add?

I recently had a great opportunity to work with an international consulting firm. They wanted to hire about 2,000 to 2,500 new programmers in the coming year. The business plan included hiring temporary employees with no prior programming experience and requiring them to attend an intensive two-month training program. Successful graduates would be eligible for full-time employment. Although this study involved programmer trainees, the approach could be applied to any position requiring problem solving, learning, or decision-making.

To help them choose the most qualified applicants, we developed a Programmer Aptitude Test we called the PAT (real creative, huh?). This test included items that represented the kinds of issues faced by programmers when deciding how to attack a project and how to troubleshoot code. The PAT did not require ANY prior knowledge of programming languages. It used code-like questions to measure an applicant’s ability to think in abstract terms as well as to apply logical thought.

So far so good. But developing a decent test is not as easy as just making up a list of questions. We actually went through several stages of development before our test was ready for prime time:

  1. We looked at job requirements and used that information to generate about 80 multiple-choice questions.
  2. We gave these questions to about 100 people, examining each test question and possible answers to see what was too hard or too easy.
  3. We edited the list and repeated the process until we were sure each test item and its answer choices had the right level of difficulty for use as a hiring test.
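To make step 2 a little more concrete, here is a minimal sketch of one common way to check whether items are too hard or too easy: compute the share of test takers who answered each item correctly. This is an illustration only, not the procedure we actually ran. The function names, the 0/1 response coding, and the cutoffs of 0.25 and 0.90 are all assumptions made for the example.

```python
from typing import Dict, List


def item_difficulty(responses: List[List[int]]) -> List[float]:
    """Return the proportion of people who answered each item correctly.

    responses[p][i] is 1 if person p got item i right, otherwise 0
    (an assumed coding for this sketch).
    """
    if not responses:
        raise ValueError("need at least one test taker")
    n_people = len(responses)
    n_items = len(responses[0])
    return [
        sum(person[i] for person in responses) / n_people
        for i in range(n_items)
    ]


def flag_items(
    difficulties: List[float],
    too_hard: float = 0.25,  # hypothetical cutoff, not from the study
    too_easy: float = 0.90,  # hypothetical cutoff, not from the study
) -> Dict[int, str]:
    """Label items whose proportion correct falls outside the cutoffs."""
    flags: Dict[int, str] = {}
    for index, proportion in enumerate(difficulties):
        if proportion < too_hard:
            flags[index] = "too hard"
        elif proportion > too_easy:
            flags[index] = "too easy"
    return flags


if __name__ == "__main__":
    # Made-up answers from five people on four items.
    sample = [
        [1, 0, 1, 1],
        [1, 0, 0, 1],
        [1, 0, 1, 0],
        [1, 1, 0, 1],
        [1, 0, 1, 1],
    ]
    print(flag_items(item_difficulty(sample)))
```

Items flagged this way would be the ones to rewrite or drop before the next round of tryouts.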

While we were developing the PAT, company recruiters continued selecting applicants through a traditional hiring process that involved resume screens and interviews. People who passed were enrolled in one of four training classes. While participants were attending class, each one took the PAT online (it’s important to remind the reader that participants had already been hired and were expected to pass the training course based on interview data). Anyway, we then sat back and waited for the results to come in.

Training success was measured using scores on two programming tests, one at midterm and one final. Successful graduates had to pass both to be eligible for a full-time position. When we compared PAT scores with training results, the correlations* between PAT scores and the two training tests were .380 and .460. These were strong numbers. They showed a validated PAT could be used to predict training results significantly better than interviews.

But what do these numbers mean in practice? You know, after all the mumbo-jumbo is extracted. Well, let’s take a look at a table. Based on our initial group of 107 people, we developed a table that could predict hiring and training results based on 100 applicants who passed an interview and resume screen prior to taking the PAT. The AA (African American) minority figures should be interpreted with caution because the numbers used to calculate them are so small (22 Asian, 14 black, 4 Hispanic, 64 white, 3 not supplied).

Expected Results per 100 People

| PAT Cut Score | Overall: # Passing the PAT | Overall: # Grads Expected | Combined Minority: # Passing the PAT | Combined Minority: # Grads Expected | AA Only: # Passing the PAT | AA Only: # Grads Expected |
|---|---|---|---|---|---|---|
| 30 | 66 | 50 | 53 | 33 | 35 | 21 |
| 35 | 54 | 42 | 41 | 26 | 35 | 21 |
| 40 | 39 | 33 | 23 | 18 | 21 | 21 |
| 45 | 32 | 29 | 13 | 10 | 14 | 14 |
| 50 | 24 | 22 | 5 | 3 | n/a | n/a |

Interpreting the Table Using a Cut Score of 35

Out of 100 people who passed an initial interview and resume screen, a cut score of 35 should result in 54 people being hired and 42 training program graduates. Out of 100 combined minorities, 41 should pass and 26 should graduate. Of 100 AA applicants, 35 should pass and 21 should graduate.

Balancing Legal with Social Issues

What do we do with the minority data? Do we lower test scores and have more people fail training? Do we do a better job filling the recruiting funnel? This is not legal advice, but the law does not currently force any organization to hire all the people who apply, nor does it force an employer to hire all minorities who apply. It advises organizations not to set hiring standards higher than required for the job (that goes for interviews and resume reviews as well) and to look for ways to get more minorities into the system.

In our study, low scorers were an equal opportunity group; that is, all people who scored low tended to fail more often than people who scored high. We thought this was pretty compelling evidence of a participant’s ability to learn a programming language regardless of race, gender or age. But now we have to decide how to balance business need, job requirements, and social issues. In other words, what cut score would give minorities a better chance without seriously compromising employee quality? A quick look at the table shows this is somewhere between 30 and 40 on the PAT. We don’t know for sure, because:

  • There are only 14 AAs in the sample.
  • We think people in the combined minority group had trouble with English as a second language.
  • We need about 100 participants from each minority group to be sure of our data.
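For readers who want to weigh the trade-off themselves, the sketch below does simple arithmetic on the Expected Results table: for each cut score it works out how many of the people who pass the PAT are expected not to graduate, and what share are expected to graduate. The figures are copied from the table; the dictionary layout and the function names are assumptions made for this illustration, and the small-sample cautions above still apply.

```python
from typing import Dict, Optional, Tuple

# (number passing the PAT, number of graduates expected) per 100 applicants,
# copied from the "Expected Results per 100 People" table. None marks "n/a".
Cell = Optional[Tuple[int, int]]

EXPECTED: Dict[int, Dict[str, Cell]] = {
    30: {"overall": (66, 50), "combined_minority": (53, 33), "aa": (35, 21)},
    35: {"overall": (54, 42), "combined_minority": (41, 26), "aa": (35, 21)},
    40: {"overall": (39, 33), "combined_minority": (23, 18), "aa": (21, 21)},
    45: {"overall": (32, 29), "combined_minority": (13, 10), "aa": (14, 14)},
    50: {"overall": (24, 22), "combined_minority": (5, 3), "aa": None},
}


def expected_failures(cut: int, group: str) -> Optional[int]:
    """People who pass the PAT at this cut score but are not expected to graduate."""
    cell = EXPECTED[cut][group]
    if cell is None:
        return None
    passing, grads = cell
    return passing - grads


def graduation_yield(cut: int, group: str) -> Optional[float]:
    """Share of people passing the PAT who are expected to graduate."""
    cell = EXPECTED[cut][group]
    if cell is None:
        return None
    passing, grads = cell
    return grads / passing


if __name__ == "__main__":
    for cut in sorted(EXPECTED):
        for group in ("overall", "combined_minority", "aa"):
            failures = expected_failures(cut, group)
            share = graduation_yield(cut, group)
            if failures is None or share is None:
                print(f"cut {cut:>2} {group:<18} n/a")
            else:
                print(f"cut {cut:>2} {group:<18} "
                      f"failures={failures:>2} yield={share:.0%}")
```

Read this way, moving the overall cut score from 30 to 40 drops expected training failures from 16 to 6 per 100 applicants, but it also drops the number of people passing the PAT from 66 to 39.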

Conclusion

Properly designed and validated tests can play a major role in predicting training success by contributing significantly more information about an applicant’s ability than interview data, resume screens, and background checks combined. For this organization, making the PAT a “first hurdle” in the application process would relieve recruiters of a considerable interview burden and would tell them much more about the ability of the participant to successfully pass their training program. On the other hand, recruiters would have to screen more people to fill seats (sorry, this part never changes).

When converted into dollars, the organization should enjoy a considerable reduction of training expenses accompanied by a considerable increase in employee competency. This benefit is not limited to learning software languages or working in consulting companies. Almost any position requiring problem solving or learning would benefit from more-accurate hiring tools.

*Techie data: N=107, p <.01, z-score transformations, point-biserial correlation
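For the curious, the point-biserial correlation named in the techie data can be sketched in a few lines. It relates a continuous score to a yes/no outcome. The example below assumes the outcome is coded as pass (1) or fail (0) on a training test, and it uses made-up numbers, not the study data. The function name and inputs are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev
from typing import Sequence


def point_biserial(scores: Sequence[float], passed: Sequence[int]) -> float:
    """Point-biserial correlation between a score and a 0/1 outcome.

    r = (M1 - M0) / s * sqrt(n1 * n0 / n^2), where M1 and M0 are the mean
    scores of the pass and fail groups and s is the population standard
    deviation of all scores.
    """
    if len(scores) != len(passed):
        raise ValueError("scores and passed must have the same length")
    pass_group = [s for s, p in zip(scores, passed) if p]
    fail_group = [s for s, p in zip(scores, passed) if not p]
    n1, n0 = len(pass_group), len(fail_group)
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one pass and one fail")
    spread = pstdev(scores)
    if spread == 0:
        raise ValueError("scores must not all be identical")
    n = n1 + n0
    return (mean(pass_group) - mean(fail_group)) / spread * sqrt(n1 * n0 / n**2)


if __name__ == "__main__":
    # Made-up PAT scores and training outcomes for eight people.
    pat_scores = [28, 31, 35, 38, 42, 44, 47, 52]
    training_passed = [0, 0, 1, 0, 1, 1, 1, 1]
    print(round(point_biserial(pat_scores, training_passed), 3))
```

A value near zero would mean the score tells you little about who passes; values closer to 1 mean higher scorers pass more often.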
