Watson Glaser Critical Thinking Test 2026: Sections, Scoring & How to Pass

What the Watson Glaser critical thinking test is, its five sections (5/12/5/6/12), timing, the raw-score-to-percentile table, worked examples, and where it fits a consulting application.

Updated Jun 28, 2026. Reviewed by Road to Offer.

The Watson Glaser critical thinking test is, in 2026, the most established critical-reasoning screen in high-stakes graduate hiring, and according to AssessmentDay it carries more than 90 years of history with its most recent major revision published in 2011. The current Watson-Glaser III is item-banked and, per AssessFirst, runs to 40 questions in roughly 30 minutes, about 45 seconds each, split across five sections: Inference, Recognition of Assumptions, Deduction, Interpretation, and Evaluation of Arguments. It is primarily a law-firm test, with UK firms such as Clifford Chance, Freshfields, and Hogan Lovells using it to gate training-contract applications. But the skill it measures, structured reasoning under time pressure, is exactly what consulting and finance recruiters probe through aptitude tests and case interviews. This guide consolidates the section breakdown and the raw-score-to-percentile table that competitors scatter across pages, then shows where the test fits a consulting application and how to clear it.

What the Watson Glaser Test Is and Why Screeners Use It

The Watson Glaser Critical Thinking Appraisal is a psychometric test that measures how well you reason from written information: whether you can separate what a passage actually supports from what merely sounds reasonable. It does not test general knowledge, vocabulary, or numeracy. It tests judgment under constraint, which is why employers who need to filter large applicant pools before expensive interviews lean on it.

Per iPrep, the test is used predominantly by law firms, especially in the UK, where it screens training-contract and vacation-scheme applicants. That is the honest framing most consulting-focused candidates miss. Watson Glaser is not an MBB staple in the way the case interview is. But its underlying construct, critical reasoning, is the same competency that consulting and finance recruiters test through verbal and numerical aptitude batteries and, later, through the case itself. If you are applying across law, consulting, and finance grad schemes in the same cycle, you may well meet it, and the reasoning habits it rewards transfer directly to a case interview.

The practical takeaway: treat Watson Glaser as a reasoning-discipline test, not a trivia test. The candidates who struggle are usually the ones who import outside facts or rush past the exact wording. The candidates who clear it slow down just enough to judge each item on its own terms.

The Five Sections and the RED Model

Figure: Watson Glaser five sections and RED model diagram.

The Watson Glaser is built from five sections with fixed question counts. On the 40-question Watson-Glaser III, the breakdown reported by AssessFirst and The Lawyer Portal is:

| Section | Questions | What it tests | Answer options |
| --- | --- | --- | --- |
| Inference | 5 | Judging the probable truth of a conclusion drawn from stated facts | 5-point scale |
| Recognition of Assumptions | 12 | Spotting unstated things a statement takes for granted | 2 (Assumption Made / Not Made) |
| Deduction | 5 | Whether a conclusion necessarily follows from the premises | 2 (Follows / Does Not Follow) |
| Interpretation | 6 | Whether a conclusion follows beyond reasonable doubt from a passage | 2 (Follows / Does Not Follow) |
| Evaluation of Arguments | 12 | Whether an argument is strong (relevant and important) or weak | 2 (Strong / Weak) |
| Total | 40 | Critical reasoning under time pressure | About 30 minutes |

Modern Watson Glaser scoring groups these five sections under the RED model, the framework Pearson uses to report performance. Per iPrep, RED stands for three subscales:

  • R, Recognize Assumptions. The Recognition of Assumptions section. Can you detect what an argument quietly assumes?
  • E, Evaluate Arguments. The Evaluation of Arguments section. Can you tell a strong, relevant argument from a weak or trivial one?
  • D, Draw Conclusions. The Inference, Deduction, and Interpretation sections combined. Can you reason from evidence to a sound conclusion without overreaching?

Knowing the RED grouping matters because firms sometimes report your three subscale results rather than one raw number, and because it tells you where the weight sits. Recognition of Assumptions and Evaluation of Arguments are the two largest sections at 12 questions each, so 24 of the 40 questions, or 60 percent of the test, are about reading critically rather than doing formal logic.
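The weighting can be tallied directly from the section counts. The sketch below is a minimal illustration in Python, not anything Pearson publishes: the names `SECTIONS`, `RED_GROUPS`, and `red_weights` are hypothetical, and the counts assume the 40-question breakdown reported above.

```python
# Question counts per section, as reported for the 40-question WG-III.
SECTIONS = {
    "Inference": 5,
    "Recognition of Assumptions": 12,
    "Deduction": 5,
    "Interpretation": 6,
    "Evaluation of Arguments": 12,
}

# RED subscales and the sections each one covers.
RED_GROUPS = {
    "Recognize Assumptions": ["Recognition of Assumptions"],
    "Evaluate Arguments": ["Evaluation of Arguments"],
    "Draw Conclusions": ["Inference", "Deduction", "Interpretation"],
}


def red_weights(sections=SECTIONS, groups=RED_GROUPS):
    """Return {subscale: (question_count, share_of_test)}."""
    total = sum(sections.values())
    weights = {}
    for subscale, names in groups.items():
        count = sum(sections[name] for name in names)
        weights[subscale] = (count, count / total)
    return weights


for subscale, (count, share) in red_weights().items():
    print(f"{subscale}: {count} questions ({share:.0%})")
```

On these counts, Recognize Assumptions and Evaluate Arguments hold 12 questions each (30 percent apiece) and Draw Conclusions holds 16 (40 percent), so no single subscale can be neglected in practice.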

Format and Timing: Watson-Glaser III vs the Older Forms

There is no single Watson Glaser format, which is why prep pages disagree. The version you sit depends on the employer.

The current standard is the Watson-Glaser III (WG-III), which iPrep describes as item-banked: questions are drawn from a large pool so two candidates rarely see an identical paper. It is 40 multiple-choice questions, typically completed in about 30 to 40 minutes depending on the employer's setting. AssessFirst puts the working pace at roughly 45 seconds per question for a 40-question, 30-minute sitting.

The earlier Watson-Glaser II used fixed Forms D and E rather than an item bank. You will still see references to longer papers: The Lawyer Portal notes the test can appear as 40 or 80 questions, and AssessmentDay lists common formats of 30 questions in 40 minutes or 80 questions in 60 minutes. Employers also set their own windows, frequently between 30 and 60 minutes.

The implication for prep is simple. Do not anchor on one number. Confirm your question count and time limit from the invitation email, then practise at the matching pace. If you only know you are facing "the Watson Glaser," prepare for the 40-question, 30-minute WG-III, because that is the most common modern sitting.
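Once you know your format, the pace is simple arithmetic. The following is a small Python sketch for working it out; the function name `seconds_per_question` is hypothetical, and the formats listed are only the ones this guide mentions, so substitute the figures from your own invitation email.

```python
def seconds_per_question(questions: int, minutes: float) -> float:
    """Average time budget per question, in seconds."""
    if questions <= 0:
        raise ValueError("questions must be positive")
    return minutes * 60 / questions


# Formats mentioned in this guide: (questions, minutes).
formats = {
    "WG-III, 30-minute sitting": (40, 30),
    "WG-III, 40-minute sitting": (40, 40),
    "30 questions in 40 minutes": (30, 40),
    "80 questions in 60 minutes": (80, 60),
}

for label, (questions, minutes) in formats.items():
    pace = seconds_per_question(questions, minutes)
    print(f"{label}: {pace:.0f} seconds per question")
```

The 40-question, 30-minute sitting and the 80-question, 60-minute sitting both work out to 45 seconds per question, which is why practising at that pace covers the most demanding common cases.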

Worked Examples for All Five Sections

Figure: Watson Glaser worked examples map for all five critical thinking sections.

Each section has a distinct trap. The examples below are illustrative, built to show the answer logic and the most common mistake, not real test items.

Inference (5-point scale)

You treat the passage as true, then rate a proposed inference as True, Probably True, Insufficient Data, Probably False, or False based on how far the passage supports it.

Passage: Last year, 1,200 candidates applied to a boutique consulting firm's graduate programme. The firm used the Watson Glaser as its first screen and invited 300 of them to a case interview.

  • Inference: "Three quarters of applicants were not invited to interview." With 300 of 1,200 invited, 900 were not, which is 75 percent. Rated True, because the numbers in the passage support it directly.
  • Inference: "The firm values critical thinking more than any other skill." Rated Insufficient Data. The passage never compares skills, so any answer here imports outside reasoning. The trap is treating a plausible real-world belief as if the passage proved it.

Recognition of Assumptions (Assumption Made / Not Made)

An assumption is something the statement takes for granted without saying it.

Statement: "We should require all consulting applicants to pass the Watson Glaser, because it predicts on-the-job performance."

  • Proposed assumption: "There is a relationship between Watson Glaser scores and job performance." Assumption Made. The argument collapses without it.
  • Proposed assumption: "The Watson Glaser is the only valid predictor of performance." Assumption Not Made. The word "only" is too strong; the statement never requires it. Absolutes like all, only, and never are the classic trap here.

Deduction (Follows / Does Not Follow)

A conclusion follows only if it must be true given the premises, with nothing added.

Premises: All candidates who scored above the 80th percentile were invited to a case interview. Maria was invited to a case interview.

  • Conclusion: "Maria scored above the 80th percentile." Does Not Follow. The premise says all high scorers were invited, not that only high scorers were invited; Maria could have been invited by another route. Assuming the converse is the most common deduction error.
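The converse error can be checked mechanically by enumerating cases. The sketch below is illustrative Python, not part of any test material: it models Maria as two booleans (the hypothetical names `high_scorer` and `invited`), keeps only the cases consistent with the premises, and asks whether the conclusion holds in every one of them.

```python
from itertools import product


def follows(premises, conclusion):
    """True if the conclusion holds in every case that satisfies all premises."""
    cases = [
        case
        for case in product([False, True], repeat=2)
        if all(premise(*case) for premise in premises)
    ]
    return all(conclusion(*case) for case in cases)


# Each case is (high_scorer, invited) for Maria.
premises = [
    # "All candidates above the 80th percentile were invited":
    # being a high scorer implies being invited.
    lambda high_scorer, invited: (not high_scorer) or invited,
    # "Maria was invited to a case interview."
    lambda high_scorer, invited: invited,
]

# Proposed conclusion: "Maria scored above the 80th percentile."
print(follows(premises, lambda high_scorer, invited: high_scorer))  # False

# Swap the second premise for "Maria scored above the 80th percentile":
# now "Maria was invited" does follow.
valid_premises = [premises[0], lambda high_scorer, invited: high_scorer]
print(follows(valid_premises, lambda high_scorer, invited: invited))  # True
```

The first check fails because one case survives the premises in which Maria was invited without being a high scorer. A single surviving counterexample is enough for Does Not Follow.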

Interpretation (Follows / Does Not Follow)

Here you judge whether a conclusion follows beyond reasonable doubt from a short passage.

Passage: In a pilot, candidates who completed three timed practice tests scored on average 12 percentage points higher than those who did none.

  • Conclusion: "Practising under timed conditions can improve Watson Glaser performance." Follows. The hedge word "can" stays within what the data shows.
  • Conclusion: "Anyone who does three practice tests will reach the 90th percentile." Does Not Follow. The words "anyone" and the specific percentile overreach well past an average improvement. Overreaching beyond the data is the interpretation trap.

Evaluation of Arguments (Strong / Weak)

A strong argument is both relevant to the question and important. A true statement can still be a weak argument.

Question: Should consulting firms keep using the Watson Glaser as a first-round screen?

  • Argument: "Yes, because it gives a standardised, objective measure of reasoning that is hard to fake on a CV." Strong. Directly relevant and important to the decision.
  • Argument: "No, because some candidates find tests stressful." Weak. It may be true, but it is trivial and applies to almost any selection method, so it does little to settle this specific question.

Notice the common thread across all five: every answer is judged on the wording and evidence in front of you, never on what you happen to know about the world. That discipline is also the backbone of a strong case-interview structure, where you reason from the prompt and the data rather than from assumptions.

The Answer Scale and How to Read It Without Overthinking

The scales trip people up more than the content does. Two patterns cover the whole test.

The Inference 5-point scale. Per iPrep, only the Inference section offers five choices: True, Probably True, Insufficient Data, Probably False, False. Read it as a confidence ladder. Choose True or False only when the passage makes the inference certain. Use Probably True or Probably False when the passage tilts one way but leaves room. Reach for Insufficient Data when answering would require a fact the passage simply does not give you. Most candidates underuse Insufficient Data; it is correct far more often than instinct suggests.

The 2-choice sections. The other four sections offer just two options each: Assumption Made or Not Made, Conclusion Follows or Does Not Follow, Argument Strong or Weak. The mistake here is overthinking. There is no partial credit and no shades of grey, so once you have judged the item strictly from the text, commit and move on. Lingering for a perfect rationale burns the seconds you need for the 12-question Recognition of Assumptions and Evaluation of Arguments sections.

A clean reading rule for the whole test: decide what the text supports, then pick the answer that says exactly that and nothing more.

Scoring and Percentiles: What to Aim For

Watson Glaser scoring is percentile-based, not pass or fail. Your raw score, the number correct out of 40, is converted into a percentile against a comparison group, and firms decide where to draw the line. Per The Lawyer Portal, the pass mark varies year to year, and CareerinLaw confirms firms compare candidates on a percentile basis with no fixed cut-off.

Here is the raw-score-to-percentile mapping reported by AssessFirst, with CareerinLaw's benchmarks added for context:

| Raw score (out of 40) | Approx. percentile | What it signals |
| --- | --- | --- |
| 39 to 40 | 95th to 99th | Exceptional; clears any firm threshold |
| 36 to 38 | 90th | Near-guarantees advancement at top firms |
| 33 to 34 | 80th | Strong; the common target for competitive firms |
| About 22 (55%) | Around average | Below most firm thresholds |

CareerinLaw reports the average score is about 55 percent and that the 80th percentile or higher is considered good, with 90 percent or above near-guaranteeing advancement at top firms. The Lawyer Portal recommends aiming for 75 percent or more, which on a 40-question paper is 30 correct. These sources mix raw percentages with percentiles, and the two scales are not interchangeable: in the table above, a raw score of 33 to 34 (about 83 to 85 percent correct) maps to roughly the 80th percentile. Put together, a sensible target is the 80th percentile or above: roughly 33 correct out of 40 on a WG-III. Push for the 90th if you are chasing the most selective employers. Do not try to reverse-engineer a hidden cut-off, because it moves with each year's applicant pool.
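To make the mapping concrete, the reported bands can be expressed as a simple lookup. This is a sketch in Python under stated assumptions: the bands are only the ones reported in the table above, the names `REPORTED_BANDS` and `approx_percentile` are hypothetical, and any raw score outside those bands returns `None` rather than an interpolated guess, because real conversions depend on the employer's comparison group.

```python
from typing import Optional

# (lowest raw score, highest raw score, approximate percentile label),
# taken only from the bands reported in the table above.
REPORTED_BANDS = [
    (39, 40, "95th to 99th"),
    (36, 38, "90th"),
    (33, 34, "80th"),
]


def approx_percentile(raw_score: int, total: int = 40) -> Optional[str]:
    """Return the reported percentile band for a raw score, or None if unreported."""
    if not 0 <= raw_score <= total:
        raise ValueError(f"raw score must be between 0 and {total}")
    for low, high, label in REPORTED_BANDS:
        if low <= raw_score <= high:
            return label
    return None  # no reported figure for this score; do not interpolate


for score in (40, 37, 33, 35):
    print(score, approx_percentile(score) or "not reported in the table")
```

Returning `None` for gaps such as a raw score of 35 is a deliberate choice: the sources give benchmarks, not a full conversion chart, so the lookup should not pretend to more precision than they offer.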

How to Prepare and Pass

Preparation is mostly about method and pacing, not content cramming.

  1. Manage the clock. At about 45 seconds per question, you cannot afford to stall on any single item. Take a first-instinct judgment, flag genuinely hard ones if the platform allows, and keep moving. The two 12-question sections are where time quietly disappears.
  2. Ignore prior knowledge. This is the single biggest score lever. Answer only from the passage or premises in front of you. If a conclusion feels true because of what you know about the real world but the text does not support it, it is wrong.
  3. Learn weak versus strong arguments. In Evaluation of Arguments, a strong argument is relevant and important to the specific question. A true-but-trivial point is weak. Practise asking "does this actually bear on the decision?" rather than "is this statement accurate?"
  4. Watch for absolutes. Words like all, only, always, and never make a statement easy to disprove and are common in wrong answers across the deduction, assumption, and interpretation sections.
  5. Practise under realistic conditions. Do timed sets at your expected question count, then review every miss to label the trap you fell for. iPrep notes that practice familiarises you with item types, and the goal is to make the reasoning method automatic so the clock stops being the enemy.

Which Firms Require It, and How It Maps to a Consulting Funnel

Watson Glaser is most strongly associated with UK law. The Lawyer Portal lists dedicated firm guides for Clifford Chance, CMS, DLA Piper, Freshfields, and Hogan Lovells, and many other UK firms use it to screen training-contract and vacation-scheme applicants before interviews. If your application is to a law firm, expect it as an early gate.

For consulting and broader professional services, the picture is more mixed, and being honest about that is the point. Watson Glaser is not a universal MBB step. What is universal is the use of aptitude testing as a first screen, and the construct overlaps heavily. A candidate strong on Watson Glaser is likely to be strong on the verbal critical-reasoning style that consulting and finance batteries test. If you want the broader landscape of what consulting actually screens with, read the aptitude tests for consulting guide, which maps numerical, verbal, and logical reasoning across firms.

The Big Four sit in the middle of this. Their advisory and consulting arms run their own aptitude and reasoning assessments, and the reasoning skills carry directly across. The PwC assessment test guide breaks down how one large professional-services firm structures its numerical, verbal, and situational screens, and the same "reason from the data, not from prior belief" discipline applies there as on Watson Glaser.

The practical mapping for a multi-track applicant:

  • Law track: Watson Glaser is likely your first screen. Treat it as a real gate.
  • Consulting track: Watson Glaser may appear at some firms, but expect verbal and numerical reasoning tests, then the case interview, as the dominant path.
  • Finance track: Critical-reasoning and numerical tests are common, with the same passage-bound logic.

Where Watson Glaser Sits Relative to the Case Interview

Think of Watson Glaser as one rung on a longer ladder. In a typical funnel it sits near the front: online application, then aptitude or critical-reasoning tests like Watson Glaser, then interviews. For consulting, the heaviest, most decisive stage is still the case interview, where you reason live, structure a problem, run the math, and defend a recommendation.

The good news is that the two stages reward the same core habit. Watson Glaser asks you to judge what evidence supports without overreaching. A case interview asks you to build a structure grounded in the prompt, test hypotheses against data, and avoid conclusions the numbers do not justify. Both punish the candidate who leans on assumption over evidence. Treating Watson Glaser prep as critical-thinking training, rather than a one-off hurdle, means the work compounds into your interview performance instead of being thrown away after the test.

That is also why generic question packs underdeliver. They drill recall of a fixed item set, which the WG-III item bank renders useless, and they do nothing for the live reasoning a case demands. Building the underlying judgment, the ability to structure a problem and reason strictly from evidence, is the transferable skill.

Frequently asked questions