Is the CAT Out of the Bag?
How test-wiseness cues undermine your multiple choice questions, and what to do about it

This week we have been talking to veterinary colleagues about something that rarely comes up in item writing workshops but probably should: the quiet, insidious problem of test-wiseness cues. These are structural features of multiple choice questions that allow a candidate to identify the correct answer without actually knowing the material. They are not cheating. They are not even intentional. They are simply the consequence of not following the basic rules of good item writing.
The idea is not new. Doug Holton, writing on his EdTechDev blog in 2007, demonstrated the problem brilliantly with a quiz written entirely in made-up language. No candidate could possibly know the content, because there was no content. And yet every question was answerable, purely by spotting the structural cues embedded in the item construction. We recommend looking at it. It is humbling.
So what are these cues?
Test-wiseness cues come in many forms. Grammatical mismatches between the stem and the options. One option that is conspicuously longer than the others. Word repetition between the stem and the correct answer. Options that are not on the same conceptual axis. Syllable count disparities. Precision mismatches between options. Each of these gives the game away. Each of them tells the test-wise candidate which answer to pick, regardless of their knowledge.
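Two of these cues are mechanical enough to sketch in code. The snippet below is purely illustrative (it is not itemCrtQ, and the flawed item it checks is hypothetical): one heuristic flags an option that is conspicuously longer than the rest, the other flags options that repeat a content word from the stem.

```python
# Illustrative sketch of two test-wiseness heuristics from the list
# above. The thresholds and the example item are assumptions, not
# any real tool's behaviour.

def longest_option_cue(options, threshold=1.5):
    """Return the option conspicuously longer than the others, if any."""
    lengths = [len(o) for o in options]
    longest = max(lengths)
    others = [n for n in lengths if n != longest] or [longest]
    if longest >= threshold * (sum(others) / len(others)):
        return options[lengths.index(longest)]
    return None

def repetition_cue(stem, options):
    """Return options that repeat a content word from the stem."""
    stem_words = {w.lower().strip("?.,") for w in stem.split() if len(w) > 4}
    return [o for o in options
            if any(w in o.lower() for w in stem_words)]

# A deliberately flawed, made-up item for demonstration.
stem = "Which drug blocks histamine receptors in dogs?"
options = [
    "Prednisolone",
    "A histamine receptor antagonist such as chlorphenamine",
    "Meloxicam",
    "Amoxicillin",
]

print(longest_option_cue(options))   # the long option stands out
print(repetition_cue(stem, options)) # repeats "histamine" from the stem
```

Both heuristics point at the same option here, which is exactly how a test-wise candidate reads a flawed item: the cues converge on the answer before any knowledge is applied.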
The CAT framework
To make it memorable for our veterinary educators we have developed the CAT framework. We make no apology for the mnemonic. Item quality can be considered through three lenses:
C - Context. The stem is not a story. It has a function in the testing point. The candidate must engage with it, interpret it, and use it to reach an answer. If they can ignore it and still get the question right, it has no value. Other than to depress them with endless words.
A - Axis. Every option must sit on the same conceptual plane. The moment one distractor differs in type, specificity, structure, or grammar, it becomes a cue. The candidate should have to know the answer, not spot the odd one out.
T - Testing point. What must the candidate be able to do? Not recall. Not recognise. Do. Define the behaviour or competence before you write a single word of the question. Too many testing points exist in name only. A real testing point demands a real behaviour: interpreting, deciding, prioritising, discriminating.
Now try it yourself
Below are four multiple choice questions. None of them are about real diseases or real treatments. The conditions are entirely made up. And yet you should be able to answer every single one correctly. See how many you get right, and then ask yourself how you knew.
Question 1: Which of the following are clinical signs of pirfloxemia in dogs?
a) acute zanthriavexin
b) chronic glimflexiamort
c) severe korfaxiabrendel
d) ornithexia and brenflaxis
Question 2: What is the optimal treatment for equine flundigosis?
a) Zordox
b) Gliffleplex
c) Morthaxine
d) Ennodexifal
Question 3: How should glimfloxaemia be treated in cats?
a) NSAIDs
b) Corticosteroids
c) Antihistamines
d) Azathioprine
Question 4: Which organism is responsible for causing M fever in cattle?
a) Florbexia zunthalis
b) Mandorkia pyrexiae
c) Zanthrobia glifflex
d) Brenflax korfalis
How did you do? The answers are not here. But if you spotted the cues, you already understand the problem. And if you did not, that is the point.
Test-wiseness is just one corner of a much larger landscape of item writing flaws. But it is a good place to start. And we had fun with it.
These cues are insidious
They do not arrive through malice. They arrive because item writers do not follow the simple rules of good question construction. Check the CAT before you sign off any item. Does the Context do useful work? Are the distractors on the same Axis? Is the Testing point a real behaviour?
If the answer to any of those is no, your question is giving something away. And the candidate it rewards is not necessarily the one who deserves it.
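The three CAT questions can be read as a simple sign-off gate: every answer must be yes before an item goes forward. A minimal sketch, assuming nothing about any real tool's data model:

```python
# A minimal sign-off sketch of the CAT check described above.
# Field names are illustrative assumptions, not part of itemCrtQ.

from dataclasses import dataclass

@dataclass
class CatCheck:
    context_does_useful_work: bool    # C: can the stem be ignored and the item still answered?
    options_share_one_axis: bool      # A: same type, grammar, and specificity throughout?
    testing_point_is_behaviour: bool  # T: interpret, decide, prioritise, discriminate?

    def passes(self) -> bool:
        # A single "no" is enough to send the item back to the writer.
        return all((self.context_does_useful_work,
                    self.options_share_one_axis,
                    self.testing_point_is_behaviour))

item = CatCheck(context_does_useful_work=True,
                options_share_one_axis=False,  # one distractor is the odd one out
                testing_point_is_behaviour=True)
print(item.passes())  # False: this item is giving something away
```

The point of the gate is that the check happens before sign-off, not in a review meeting weeks later.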
This is exactly why we built itemCrtQ. Not to review your questions two weeks later. Not to embarrass you in a question review meeting with the dean. itemCrtQ sits on your shoulder as you write, providing real time guidance at the moment it matters. Like a cat, it is always watching. Unlike a cat, it will not bite the hand that feeds it. Its teeth are velvet, yet sharp.
Sharper questions. Smarter exams.