Pathology Question Quality Review
Executive Summary
This review covers a random sample of 200 validated non-gold candidate questions drawn from the Pathology subject pool of 12,365 items. The sample was analyzed across eight shards of 25 questions each. The findings are consistent and mutually reinforcing across shards: the dominant quality problem in this subject is not a narrow cluster of broken items but a systemic structural deficit that affects the majority of the candidate pool.
The Blooms distribution of the candidate sample tells the story plainly: 71 questions at Blooms 1, 93 at Blooms 2, 18 at Blooms 3, 16 at Blooms 4, and 2 at Blooms 5. That means roughly 82% of sampled questions operate at recall or simple comprehension level. The benchmark standard — eight INI-CET questions, all at Blooms 3 or above, all embedded in rich clinical vignettes — represents a quality ceiling that the vast majority of candidate items do not approach.
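The 82% figure follows directly from the reported counts. A minimal sketch of the arithmetic, using only the numbers stated above (the helper names are illustrative, not part of any review tooling):

```python
# Bloom's level counts as reported for the 200-question sample.
blooms_counts = {1: 71, 2: 93, 3: 18, 4: 16, 5: 2}

total = sum(blooms_counts.values())


def share_at_or_below(level: int) -> float:
    """Fraction of the sample tagged at or below the given Bloom's level."""
    return sum(n for lvl, n in blooms_counts.items() if lvl <= level) / total


def share_at_or_above(level: int) -> float:
    """Fraction of the sample tagged at or above the given Bloom's level."""
    return sum(n for lvl, n in blooms_counts.items() if lvl >= level) / total


low_order = share_at_or_below(2)        # recall / simple comprehension
benchmark_level = share_at_or_above(3)  # the level every benchmark item reaches

print(f"{total} items; Blooms 1-2: {low_order:.0%}; Blooms 3+: {benchmark_level:.0%}")
```

These shares are based on the tags as recorded; Category 4 below argues that the true low-order share is higher.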
Beyond the Blooms deficit, the reviewed set reveals a meaningful rate of factually incorrect answer keys, a recurring pattern of structurally defective option formats, a subset of questions with broken or unverifiable image dependencies, a cluster of items that belong to other subjects or are mislabelled within Pathology, and a layer of numerical trivia and low-yield factoids that add noise without discriminatory value.
The actionable summary: approximately 30–35% of the reviewed sample is suitable for keep or minor fix; approximately 40–45% requires substantive vignette-level rewriting to reach acceptable quality; and approximately 20–25% should be disabled outright because the concept is either trivially covered, factually unsafe, or so low-yield that rewriting would not be cost-effective given the depth of existing gold-standard coverage.
What Good Looks Like
The benchmark set provides a clear and consistent quality bar. Every benchmark question shares the following properties:
Rich clinical scaffolding. The patient is described with age, sex, duration of symptoms, relevant history, examination findings, and investigation results. The candidate must integrate this information before the question can even be answered. See the diabetic nephropathy question (98fcab91): 15 years of diabetes, bilateral oedema, frothy urine, BP 160/100, creatinine 2.8, albumin 2.2, 24-hour protein 5.5 g, biopsy showing nodular glomerulosclerosis — only then is the question asked. The clinical context is not decorative; it is load-bearing.
Reasoning demand at Blooms 3 or above. Benchmark questions do not ask what a finding is called. They ask what additional finding would be expected (98fcab91), what the pathogenesis is (58d7d7fe), what the underlying mechanism is (7519893b), which feature predicts malignancy (29ec786a), or what the immunofluorescence pattern would show (290aa049). The candidate must apply or analyse, not retrieve.
Distractors that are plausible to a prepared candidate. In the phaeochromocytoma question (29ec786a), the distractors (nuclear pleomorphism, Zellballen pattern, S-100 in sustentacular cells) are all real features of the tumour. Only a candidate who understands that conventional cytological criteria do not predict malignancy in phaeochromocytoma will select capsular and vascular invasion. Weak distractors (implausible options, synonyms, "all of the above") are absent from the benchmark set.
Histopathological specificity. Benchmark questions name specific morphological findings — subepithelial humps, wire-loop lesions, spongiform changes, basaloid cells with peripheral palisading — and ask the candidate to reason from or about them. This is appropriate for a subject whose core competency is morphological diagnosis.
Appropriate difficulty calibration. Even questions flagged "easy" in the benchmark (e.g., 899d5bca on SLE renal pathology) require the candidate to connect a clinical presentation, a biopsy finding, and an immunological mechanism. They are easy because the reasoning chain is short, not because the question is decontextualised.
The PYQ set adds one further dimension: some high-quality questions are legitimately Blooms 1 or 2 when the factual target is genuinely high-yield and the distractors are meaningfully discriminating (e.g., 9f700db4 on CLL cell of origin, f50d3086 on pyroptosis). The issue in the candidate pool is not that Blooms 1–2 questions exist; it is that the majority of them test trivial or poorly framed facts with implausible distractors and no clinical anchoring.
Main Issue Categories
1. Decontextualised Recall: The Dominant Structural Deficit
Why this pattern is bad
Indian PG entrance examinations — INI-CET in particular — have moved decisively toward clinical vignette-based questions that require application and analysis. A question that asks "most common cause of cell injury" or "which stain is used for fungus" tests whether a candidate has read a textbook, not whether they can reason about a patient. These items also have poor psychometric properties: they are susceptible to rote memorisation, they do not discriminate between candidates who understand a concept and those who have merely memorised a phrase, and they are trivially defeated by any candidate who has seen the question before.
The benchmark standard makes the contrast explicit. Every benchmark question embeds the same factual targets — amyloid typing, glomerulonephritis patterns, pituitary tumour histology — inside a clinical scenario. The candidate who knows that "AL amyloid causes cardiac disease" will fail the benchmark question if they cannot also reason about which clinical and histological features distinguish AL from ATTR amyloidosis in context.
How it shows up
This is the single most prevalent pattern in the reviewed set, appearing across every shard and every topic area. It is not a narrow cluster; it is the default mode of question construction in this pool. Approximately 60–65% of the candidate sample falls into this category. The questions share a recognisable structure: a one- or two-sentence stem naming a disease or finding, followed by four options of which one is a memorised association. No patient. No reasoning chain. No integration.
The pattern is particularly dense in General Pathology (cell injury, inflammation, repair), Hematopathology (peripheral smear associations, lymphoma markers), Neoplasia (tumour markers, IHC associations), and Neuropathology (inclusion body associations, tumour features).
Example question IDs
- Q43964cfa — "Most common cause of cell injury is hypoxia." Single bare fact, no vignette, Blooms 1. The concept is important; the question format is not.
- Qb8d5faff — "What process does cytosolic Cytochrome C mediate?" Single-step association, no clinical context.
- Q18020b2b — "Congo red birefringence colour." Pure Blooms 1 recall; distractors (silver, golden, blue) are implausible to any prepared candidate.
- Qe755cfac — "Brown tumour → hyperparathyroidism." Single association, Blooms 1, distractors are implausible.
- Qfdd4b465 — "Best source of Factor VIII." No clinical framing, no reasoning required.
- Q5d9dec55 — "Shock lung = ARDS." Glossary-level association, Blooms 1.
- Q3e7b798f — "BCR-ABL → CML." Blooms 1, already well-covered by PYQs in the benchmark set.
- Q17b05613 — "What does cytopathology deal with?" Definitional recall with no discriminatory value at PG level.
- Q79e37214 — "Stain used in electron microscopy." Pure rote recall.
- Q418c69d5 — "Birbeck granules seen in?" Blooms 1, flagged easy, no vignette.
- Qffea18a1 — "Schaumann bodies seen in?" Same pattern.
- Q88a55fb9 — "Adult PCKD inheritance?" Same pattern.
- Q76a98105 — "Hereditary spherocytosis transmitted as?" Same pattern.
- Q3a22b456 — "PSA = tumour marker." Single sentence, trivially obvious.
- Q74669bc9 — "Waldenström = lymphoplasmacytic lymphoma." Bare association.
- Q65966351 — "Bloom-Richardson includes mitotic rate." Immediately obvious to any candidate who has read the topic once.
Recommended disposition
For items in this category, the decision between fix and disable should be driven by two factors: (a) whether the underlying concept is genuinely high-yield for INI-CET/NEET-PG, and (b) whether a vignette-level rewrite would produce a question meaningfully different from existing gold-standard coverage. Where the concept is high-yield and no good vignette version exists in the gold set, fix by embedding in a clinical scenario. Where the concept is already well-covered by PYQs or benchmark items, or where the fact is too trivial to justify a full vignette, disable. The majority of items in this category should be disabled rather than rewritten, because the rewrite effort is high and the conceptual territory is already covered at higher quality elsewhere in the pool.
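The two-factor rule above can be written as a small decision function. This is a sketch under assumptions: the `RecallItem` fields are hypothetical flags that a reviewer would set by hand, and the second example ID is a placeholder rather than an item from the sample.

```python
from dataclasses import dataclass


@dataclass
class RecallItem:
    question_id: str
    high_yield: bool        # factor (a): concept is genuinely high-yield
    covered_in_gold: bool   # factor (b): a good vignette version already exists


def disposition(item: RecallItem) -> str:
    """Fix only when the concept is high-yield AND not already covered."""
    if item.high_yield and not item.covered_in_gold:
        return "fix"
    return "disable"


items = [
    # BCR-ABL -> CML: high-yield, but noted above as already well covered by PYQs.
    RecallItem("Q3e7b798f", high_yield=True, covered_in_gold=True),
    # Hypothetical placeholder: a high-yield concept with no gold vignette yet.
    RecallItem("Q-hypothetical", high_yield=True, covered_in_gold=False),
]

for item in items:
    print(item.question_id, disposition(item))
```

Because both conditions must hold for a fix, most items fall through to disable, which matches the expectation stated above.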
2. Factually Incorrect or Unsafe Answer Keys
Why this pattern is bad
A question with a wrong correct answer is worse than no question at all. It actively miseducates candidates, erodes trust in the platform, and creates medico-legal risk if the misinformation influences clinical reasoning. This is the highest-priority quality failure in the reviewed set, and it requires a dedicated expert review pass rather than a routine editorial fix.
The rate of factual errors observed in this sample is concerning. Across eight shards, reviewers identified at least ten questions with demonstrably incorrect or seriously contestable correct answers, a floor of 5% of the 200-item sample. If the sample is representative, that rate applied to the full pool of 12,365 questions implies several hundred affected items.
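The extrapolation can be made explicit. The sketch below assumes the 200 items behave like a simple random sample of the pool and treats the count of ten as a floor; the Wilson score interval is a choice made here for illustration, not a method used in the review.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


SAMPLE_SIZE = 200   # reviewed sample
FLAGGED = 10        # "at least ten" items with wrong or contestable keys
POOL_SIZE = 12_365  # full Pathology pool

low, high = wilson_interval(FLAGGED, SAMPLE_SIZE)
point_estimate = round(FLAGGED / SAMPLE_SIZE * POOL_SIZE)

print(f"Observed rate: {FLAGGED / SAMPLE_SIZE:.1%}")
print(f"Approximate 95% interval: {low:.1%} to {high:.1%}")
print(f"Implied items in pool: about {point_estimate} "
      f"(range {round(low * POOL_SIZE)} to {round(high * POOL_SIZE)})")
```

The width of the interval is the practical point: a 200-item sample supports the claim that the problem is real, but not a precise count, which is why a dedicated accuracy audit is recommended.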
How it shows up
The errors cluster in three sub-types:
Outright factual inversion: The correct answer is simply wrong by standard reference. These are the most dangerous items.
Outdated answers: The correct answer was once accepted but has been superseded by current evidence or classification systems.
Ambiguous keys without sufficient clinical context to disambiguate: The answer is correct in one clinical scenario but wrong in another, and the question provides no context to specify which scenario applies.
Example question IDs
- Qbb5c26e6 — p-ANCA specificity marked as Wegener's granulomatosis. This is a clear factual inversion: p-ANCA (MPO-ANCA) is associated with microscopic polyangiitis and Churg-Strauss syndrome; Wegener's granulomatosis is c-ANCA (PR3-ANCA). A candidate who learns from this question will be wrong on the exam and potentially in clinical practice.
- Q143a73c3 — "Wound healing is a summation of fibrolysis." Wound healing is classically described as a summation of regeneration and repair/fibroplasia. "Fibrolysis" is not a standard term in this context and the answer is factually wrong.
- Qb13605b4 — "Small cell lung carcinoma" and "oat cell carcinoma" listed as separate options. These are synonymous terms. The question is internally incoherent; any candidate who knows this will be unable to answer.
- Q87402786 — "Lung" keyed as the site with least malignant potential for carcinoid tumours. The standard teaching is that appendiceal carcinoids have the least malignant potential; lung carcinoids are intermediate. This is a factual error.
- Q8c27b657 — Fibrinoid necrosis marked as characteristic of sarcoidosis. Sarcoidosis is characterised by non-caseating granulomas without fibrinoid necrosis. Fibrinoid necrosis is the hallmark of PAN, malignant hypertension, and SLE vasculitis.
- Q71bca3a5 — Pseudomyxoma peritonei attributed to mucinous cystadenocarcinoma of the ovary. Current WHO classification and molecular evidence identify low-grade appendiceal mucinous neoplasm (LAMN) as the primary source; ovarian involvement is secondary.
- Q7bc57e72 — Fibrolamellar carcinoma described as the "immature variant" of HCC. Fibrolamellar is a well-differentiated, AFP-negative variant occurring in young adults without cirrhosis — the opposite of immature.
- Q68986139 — Anti-ribonucleoprotein listed as the most common antibody in Sjögren syndrome. Anti-SSA (Ro) is the most common antibody in Sjögren syndrome (~70–75%); anti-RNP is the hallmark of mixed connective tissue disease.
- Qc9af7b58 — Kell listed as the most important blood group system in transfusion. ABO is the most important system; Rh is second; Kell is third.
- Q4bbb1481 — "Increased LAP score" marked as the false statement about polycythaemia vera. LAP score is in fact increased in PV; it is decreased in CML. The answer key is internally contradicted.
- Qaa37f480 — Miliary tuberculosis listed as a cause of eosinophilia (correct answer), while Hodgkin's disease and filariasis — both well-established causes of eosinophilia — are listed as distractors. The answer appears factually inverted.
- Q91c8fc0b — Most common lymphoma in HIV. The answer is contested in current literature (immunoblastic vs. diffuse large B-cell vs. Burkitt's), making this factually ambiguous at Blooms 1.
Recommended disposition
All items in this category require expert SME review before any other action. Items with clear factual inversions (Qbb5c26e6, Q143a73c3, Q87402786, Q8c27b657, Q68986139, Qc9af7b58, Q4bbb1481, Qaa37f480) should be disabled immediately pending correction. Items with outdated answers (Q71bca3a5, Q7bc57e72) should be fixed with updated answer keys and, where possible, upgraded to vignette format. Items with ambiguous keys (Q91c8fc0b on HIV-associated lymphoma, Q7b0d8e5c on cardiac amyloid fibril type) should be fixed by adding clinical context that disambiguates the intended scenario. No item in this category should be deployed in its current state.
3. Structurally Defective Option Formats
Why this pattern is bad
"All of the above" and "none of the above" as correct answers are well-documented item-writing failures. They allow test-wise candidates to select the correct answer through elimination logic rather than content knowledge. A candidate who is uncertain about two options but confident that one is correct can select "all of the above" without knowing whether the remaining options are correct. This destroys the discriminatory function of the item. The same logic applies to questions where synonymous terms appear as separate options, or where a distractor is logically incompatible with the stem format.
These formats are absent from the benchmark set and from recent PYQs. Their presence in the candidate pool is a systematic quality failure, not an isolated error.
How it shows up
The pattern appears across multiple shards and topic areas. "All of the above" as the correct answer is the most common variant. "None of the above" as the correct answer is less frequent but equally problematic. A related sub-type — listing synonymous terms as separate options — appears in at least one question and creates a different but equally serious problem: the question becomes unanswerable for a knowledgeable candidate.
- Q2972ddbe — "All of the above" as correct answer for gastric carcinoma predisposing factors.
- Q6c97e273 — "None of the above" as correct answer for lung carcinoma facts.
- Q2b794432 — "All of the above" as correct answer for S100 marker associations.
- Q83acbe7b — "All the above" as correct answer for fibrinoid necrosis conditions.
- Qcd99c359 — "All of the above" as correct answer for CSF-spreading CNS tumours.
- Qd7ed4041 — "All of the above" as correct answer for benign tumours, including atheroma and granuloma, which are not true neoplasms.
- Qdae93801 — "All of the above are true" as an option in a NOT-true stem. Logically incompatible with the stem format.
- Qb13605b4 — "Small cell lung carcinoma" and "oat cell carcinoma" as separate options (synonyms).
Recommended disposition
Fix all items in this category. The fix path is consistent: replace "all of the above" with a specific, plausible distractor that a prepared candidate might genuinely consider; replace "none of the above" by converting to a positive-stem question; replace synonymous options with genuinely distinct alternatives. Where the underlying concept is high-yield, the fix is worth doing. Where the concept is also low-Bloom and already covered elsewhere, disable instead of fixing.
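Items with these formats can be flagged mechanically before manual review. The sketch below is one possible approach, assuming each question exposes its options as a list of strings. The option texts in the usage lines are illustrative, and `SYNONYM_SETS` holds only the one pair named in this review; a real list would need SME curation.

```python
import re

# Matches "All of the above", "All the above", "None of the above",
# and variants such as "All of the above are true".
CATCH_ALL = re.compile(
    r"(all|none)\s+(of\s+)?(the\s+)?above(\s+are\s+(true|false|correct))?",
    re.IGNORECASE,
)

# Assumption: a hand-maintained list of terms that must not appear together.
SYNONYM_SETS = [
    {"small cell lung carcinoma", "oat cell carcinoma"},
]


def normalise(text: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing full stop."""
    return re.sub(r"\s+", " ", text.strip().lower()).rstrip(".")


def flag_options(options: list[str]) -> list[str]:
    """Return the structural defects found in one question's option list."""
    cleaned = [normalise(o) for o in options]
    flags = []
    if any(CATCH_ALL.fullmatch(o) for o in cleaned):
        flags.append("catch_all_option")
    if any(len(group.intersection(cleaned)) > 1 for group in SYNONYM_SETS):
        flags.append("synonymous_options")
    return flags


print(flag_options(["Option A", "Option B", "Option C", "All the above"]))
print(flag_options(["Small cell lung carcinoma", "Oat cell carcinoma",
                    "Adenocarcinoma", "Squamous cell carcinoma"]))
```

A detector like this only finds candidates. Whether to fix or disable each one still depends on the yield and coverage judgement described above.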
4. Blooms Level Miscoding
Why this pattern is bad
Blooms level assignments drive how questions are selected for tests, how difficulty is calibrated, and how the pool's cognitive distribution is reported. When a Blooms 1 recall item is tagged Blooms 4, it inflates the apparent quality of the pool, causes test templates to be populated with items that appear to be higher-order but are not, and makes it impossible to accurately audit the pool's cognitive distribution. The candidate sample's reported distribution (71 at Blooms 1, 93 at Blooms 2, 18 at Blooms 3, 16 at Blooms 4, 2 at Blooms 5) almost certainly understates the true proportion of Blooms 1 items because many items tagged Blooms 2–5 are functionally Blooms 1.
How it shows up
The miscoding runs in one direction: items are tagged higher than their actual cognitive demand. A question asking "which cell type expresses CD15 and CD30?" is Blooms 1 regardless of whether it is tagged Blooms 4. A question asking "what is the definition of regeneration?" is Blooms 1 regardless of a Blooms 5 tag. The pattern is particularly prevalent in Hematopathology, Neoplasia, and Neuropathology topics.
- Q330cde03 — CD15/CD30 for Hodgkin's lymphoma tagged Blooms 4. This is a pure recall association (Blooms 1).
- Q3be3304c — Regeneration definition tagged Blooms 5. This is a definitional recall item (Blooms 1).
- Q73f36bba — Burkitt lymphoma/starry sky pattern tagged Blooms 5. Straightforward recall (Blooms 1).
- Q4bbb1481 — PV false statement tagged Blooms 4. Factual exclusion question (Blooms 1–2).
- Q083c8baa — Cholecystoses definition tagged Blooms 4. Definitional recall (Blooms 1).
- Q501022ac — Calcitriol mechanism tagged as Pathology/Blooms 3. This is a Biochemistry/Physiology item with no pathological framing.
Recommended disposition
Blooms recoding is a metadata fix that should be applied systematically across the full Pathology pool, not just the reviewed sample. For items where the recoded Blooms level is 1 and the question is otherwise structurally sound, the item may be retained as a low-level recall item if the concept is genuinely high-yield. For items where the recoded level is 1 and the question is also decontextualised and low-yield, apply the disable logic from Category 1. The Blooms audit should be treated as a prerequisite for any test-template calibration work.
5. Factual Accuracy Failures in Distractor Construction
Why this pattern is bad
This category is distinct from Category 2 (wrong correct answer) and from Category 3 (structural format failures). Here the correct answer is right, but the distractors are either implausible, internally inconsistent with the stem, or — in the most serious sub-type — factually correct statements that have been incorrectly marked wrong. When a distractor is a true statement that has been marked incorrect, the question has multiple defensible answers, which is a validity failure. When distractors are obviously implausible, the question loses discriminatory power because any prepared candidate can eliminate them immediately.
How it shows up
Implausible distractors: Options that no prepared candidate would seriously consider. In Q18020b2b (Congo red birefringence), the distractors are "silver," "golden," and "blue," none of which is a plausible alternative to apple-green. In Qe755cfac (brown tumour), the distractors are hypoparathyroidism, hypothyroidism, and hyperthyroidism, all implausible to any candidate who has read the topic.
True statements marked wrong: In Q6f27784f (antigen-presenting cell), macrophage is listed as a wrong option despite being a classical professional APC. In Q842f02f5 (cell injury mechanisms), cytosolic calcium rise and membrane damage are listed as wrong options despite being genuine mechanisms of cell injury — the question lacks a qualifier ("earliest" or "most fundamental") that would make one answer unambiguously correct.
Logically incompatible distractors: In Qdae93801 (pilocytic astrocytoma NOT true), option D reads "All of the above are true" — logically incompatible with a NOT-true stem.
Syndrome conflation in distractors: In Q4cad19be (Turcot's syndrome), features of Gardner's syndrome appear as distractors without adequate clinical context to distinguish the two polyposis syndromes, creating ambiguity rather than discrimination.
Example question IDs
- Q6f27784f — Macrophage incorrectly marked wrong as an APC.
- Q842f02f5 — Cell injury mechanisms: stem lacks qualifier making multiple options defensible.
- Qdae93801 — "All of the above are true" in a NOT-true stem.
- Q4cad19be — Turcot's/Gardner's syndrome feature conflation.
- Q24032581 — MALT lymphoma organism associations: Bartonella henselae listed as the exception for skin MALT, but the correct exception should be Borrelia burgdorferi; Bartonella causes bacillary angiomatosis.
- Q43407734 — Small cell lung carcinoma paraneoplastic syndromes: "either hyponatremia or diffuse pigmentation" framing creates ambiguity about which paraneoplastic syndrome is being asked about.
Recommended disposition
Fix all items in this category. The fix path varies by sub-type: replace implausible distractors with clinically meaningful alternatives; add qualifiers to stems where multiple options are defensible; remove logically incompatible options; verify syndrome-feature mappings against current classification. Items where the distractor problem is entangled with a factual error in the correct answer should be escalated to Category 2 handling (SME review before deployment).
6. Image-Dependent Questions with Unverifiable Image Integrity
Why this pattern is bad
Image-based questions are among the highest-value items in a Pathology question bank. Morphological diagnosis from H&E sections, peripheral smears, electron micrographs, and gross specimens is a core competency for INI-CET and NEET-PG. However, an image-based question without a functioning image is not merely a low-quality question — it is a broken question that cannot be answered at all. Candidates who encounter it will either guess randomly or report it, both of which damage platform trust. The question's value is entirely contingent on image integrity.
How it shows up
Several questions in the reviewed set are explicitly image-dependent (they reference a figure, photograph, or micrograph as part of the stem) but the image cannot be verified from the text representation. Some of these are PYQ-tagged items of genuine high value; others are lower-quality items where the image dependency adds risk without proportionate benefit.
- Q0434dd4b — Cardiac pathology, image-based PYQ (AIIMS-2015), HCM in young athlete with family history. Blooms 4, clinically rich, high template membership. High value if image is intact; non-functional if image is missing.
- Q7c2b0e07 — Hematopathology, image-based PYQ (NEET-PG-2024), Gaucher's disease. High template membership, Blooms 4. Same dependency.
- Qd83dece7 — MI aging from H&E image, PYQ-tagged NEET-PG. Blooms 4, tests applied morphological reasoning. Keep if image is verified.
- Qee86f39e — Councilman body in hepatitis biopsy, PYQ-tagged NEET-PG 2017. Blooms 4, clinically framed. Keep if image is verified.
Recommended disposition
All image-dependent questions require a dedicated image integrity verification pass before deployment. Questions where the image is confirmed present and of adequate resolution should be kept; these are among the highest-value items in the pool. Questions where the image is missing or of inadequate quality should be disabled until the image can be sourced and attached. This is not a content fix — it is a deployment readiness check that must precede any test-template inclusion of these items.
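The readiness check described above is a small state machine: unverified items are held, verified items are kept, and failed items are disabled. A minimal sketch, assuming hypothetical `image_present` and `adequate_resolution` fields recorded by whoever performs the verification pass (`None` meaning not yet checked):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageCheck:
    question_id: str
    image_present: Optional[bool]        # None = not yet verified
    adequate_resolution: Optional[bool]  # None = not yet verified


def readiness(check: ImageCheck) -> str:
    """Map the verification state of one image-dependent item to an action."""
    if check.image_present is None:
        return "hold: verify image"
    if not check.image_present:
        return "disable until image is sourced"
    if check.adequate_resolution is None:
        return "hold: verify image"
    return "keep" if check.adequate_resolution else "disable until image is sourced"


# Every image-dependent item in this review is currently in the unverified state.
for qid in ["Q0434dd4b", "Q7c2b0e07", "Qd83dece7", "Qee86f39e"]:
    print(qid, readiness(ImageCheck(qid, image_present=None, adequate_resolution=None)))
```

The explicit hold state matters: an item whose image has not been checked should not default to either keep or disable.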
7. Numerical Trivia and Low-Yield Factoids
Why this pattern is bad
A subset of questions in the reviewed set tests specific numerical values or obscure associations that have no clinical reasoning value and are not represented in recent PYQs or benchmark items. These questions do not discriminate between candidates who understand pathology and those who have memorised a single number from a single textbook. They are also frequently source-dependent: the "correct" figure varies across standard references, making the answer contestable. They add noise to the pool without contributing to its discriminatory validity.
This category is distinct from Category 1 (decontextualised recall of high-yield facts) because the facts being tested here are genuinely low-yield — they would not appear in a well-constructed exam regardless of how the question was formatted.
How it shows up
- Q23994227 — "Blood smear best visualised at pH 6.8." A laboratory technical detail with no clinical reasoning value. Not represented in any PYQ or benchmark.
- Q136d9263 — "Dermoid cyst bilaterality ~10%." The cited figure varies across sources (8–15%). No clinical application.
- Qae68faec — "Which causes apoptotic damage to sperms — Chlamydia?" Obscure, poorly sourced factoid with no relevance to standard PG pathology curriculum.
- Q9a9e483d — "Which is not a carcinogenic virus?" Molluscum contagiosum as the correct answer is straightforward elimination with no clinical reasoning required.
- Qba046d6d — "Most common trisomy — trisomy 16." Blooms 1, no clinical context, trivial recall.
Recommended disposition
Disable all items in this category. The rewrite cost is not justified because the underlying facts are not worth testing at PG level. If the topic area (e.g., ovarian tumour bilaterality, viral carcinogenesis) is worth covering, it should be covered through a new question that tests clinical reasoning about the topic rather than a memorised number or list.
8. Subject Boundary Drift and Topic Mislabelling
Why this pattern is bad
Questions from Biochemistry, Physiology, Embryology, and Genetics are appearing in the Pathology bank without pathological framing. This dilutes subject coherence, distorts topic-level coverage metrics, and creates confusion for candidates studying by subject. Separately, questions are mislabelled within Pathology — a salivary gland tumour filed under Endocrine Pathology, a chromosomal syndrome filed under Bone and Soft Tissue Pathology — which affects topic-level quality audits and test-template construction.
How it shows up
Wrong subject entirely:
- Q501022ac — Calcitriol mechanism of action. This is a Biochemistry/Physiology question with no pathological framing. It should not be in the Pathology bank.
- Qfaed81c4 — Malformation vs. disruption vs. deformation. This is a teratology/embryology concept more appropriate to Anatomy or Genetics.
Mislabelled within Pathology:
- Qdafd0fe5 — Acinic cell tumour filed under "Endocrine Pathology." Should be "Head and Neck Pathology" or "Salivary Gland Pathology."
- Qc4a2861e — Klinefelter syndrome filed under "Bone and Soft Tissue Pathology." Should be "Molecular/Reproductive Pathology."
- Q24032581 — MALT lymphoma filed under "Neoplasia" rather than "Hematopathology."
Recommended disposition
Items that belong to another subject entirely should be disabled from the Pathology bank and, if the question is otherwise of acceptable quality, transferred to the appropriate subject. Items that are mislabelled within Pathology should have their topic tags corrected; this is a metadata fix that does not require content changes. A systematic topic-label audit is recommended for the full Pathology pool, with particular attention to head and neck tumours, lymphoma questions, and genetic/syndromic conditions.
Prioritization
The eight issue categories identified above are not equally urgent. The following priority ordering reflects both the severity of the quality failure and the operational effort required to address it.
Immediate action required (before any deployment)
Category 2 — Factually Incorrect or Unsafe Answer Keys. Questions with wrong correct answers must be identified, reviewed by a subject-matter expert, and either corrected or disabled before they appear in any test. This is the only category that poses active harm to candidates. The items identified in this review (Qbb5c26e6, Q143a73c3, Q87402786, Q8c27b657, Qc9af7b58, Q4bbb1481, Qaa37f480, Q7bc57e72, Q68986139, Q71bca3a5, Qb13605b4) should be disabled immediately pending SME review. A broader accuracy audit of the full pool — particularly Immunopathology, Hematopathology, Liver Pathology, and Inflammation topics — is warranted.
Category 6 — Image-Dependent Questions with Unverifiable Image Integrity. Image-based questions that cannot be answered without a functioning image must be verified before deployment. This is a binary readiness check: verified and intact = keep; missing or degraded = disable until resolved.
High priority (fix queue, current sprint)
Category 3 — Structurally Defective Option Formats. "All of the above," "none of the above," and synonym-as-distractor items are identifiable by pattern matching and can be fixed systematically. The fix path is well-defined. This category is high-priority because these items actively undermine test validity even when the underlying content is correct.
Category 5 — Factual Accuracy Failures in Distractor Construction. Items where true statements are marked wrong, or where distractors are logically incompatible with the stem, require content review and targeted fixes. These are less dangerous than Category 2 (the correct answer is right) but still reduce validity.
Medium priority (fix or disable queue, next sprint)
Category 1 — Decontextualised Recall. This is the largest category by volume and the most resource-intensive to address. The recommended approach is triage: identify the subset of Blooms 1–2 items covering genuinely high-yield concepts not already covered by gold-standard questions, and prioritise those for vignette conversion. The remainder should be disabled. Given the scale (approximately 60–65% of the candidate sample), this work should be phased over multiple sprints.
Category 4 — Blooms Level Miscoding. A metadata audit pass that can be run in parallel with content work. Recoding Blooms levels correctly is a prerequisite for accurate pool-level reporting and test-template calibration.
Lower priority (audit and clean-up)
Category 7 — Numerical Trivia and Low-Yield Factoids. These items are easy to identify and should be disabled. The volume is relatively small and the disable decision is straightforward.
Category 8 — Subject Boundary Drift and Topic Mislabelling. Metadata corrections and cross-subject transfers. Low urgency for candidate-facing quality but important for pool integrity and coverage reporting.
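Several items in this review fall into more than one category, so the ordering above needs a tie-break rule. The sketch below encodes the four tiers and routes each item by its most urgent category. That routing rule is an assumption made here for illustration; the category memberships come from the example lists above.

```python
# Tier index per issue category, following the priority ordering above.
PRIORITY_TIER = {2: 0, 6: 0, 3: 1, 5: 1, 1: 2, 4: 2, 7: 3, 8: 3}
TIER_NAME = ["immediate", "high", "medium", "lower"]


def route(categories: set[int]) -> str:
    """Assign an item to the queue of its most urgent issue category."""
    return TIER_NAME[min(PRIORITY_TIER[c] for c in categories)]


# Items cited under more than one category in this review.
flagged = {
    "Q4bbb1481": {2, 4},  # wrong key and Blooms miscoding
    "Qb13605b4": {2, 3},  # incoherent key and synonymous options
    "Qdae93801": {3, 5},  # defective format and incompatible distractor
    "Q24032581": {5, 8},  # distractor error and topic mislabel
}

for qid, cats in flagged.items():
    print(qid, sorted(cats), "->", route(cats))
```

Routing by the most urgent category keeps a wrong answer key from waiting in a metadata queue simply because the item also has a tagging problem.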
Example Keep / Fix / Disable Calls
The following calls are drawn directly from the reviewed sample and are intended to illustrate the application of the issue categories above to specific items.
Keep
| Question ID | Rationale |
|---|---|
| Q2596b93d | HCV-associated MPGN vignette with decreased complement, subnephrotic proteinuria, hematuria. Blooms 4, clinically integrated, appropriate distractors. Meets benchmark standard. |
| Qa79a37cd | Sjögren's syndrome mechanism with anti-Ro/SSA antibodies and clinical presentation. PYQ-tagged (NEET-PG 2025), Blooms 4, well-constructed distractors. |
| Qa310e66d | Sheehan syndrome vignette (postpartum haemorrhage → pituitary infarction). Blooms 4, integrates obstetric history with endocrine pathology. Matches benchmark quality. |
| Qaf5fcdb1 | Myositis ossificans mimics osteosarcoma histologically. Blooms 4, NEET-PG tagged, tests a genuine diagnostic pitfall with real clinical consequence. |
| Qc98d2a39 | Pelvic mass with bluish-white cut surface and calcifications in a 51-year-old. Correctly identifies chondrosarcoma. Blooms 4, hard difficulty, good distractor set. |
| Q4d471f25 | Frontal-temporal atrophy with dementia; asks for most useful microscopic feature. Correctly targets Pick bodies. Blooms 3, well-constructed distractors. |
| Q0434dd4b | Cardiac pathology, image-based PYQ (AIIMS-2015), HCM in young athlete. Blooms 4. Keep if image verified intact. |
| Q7c2b0e07 | Hematopathology, image-based PYQ (NEET-PG-2024), Gaucher's disease. Blooms 4. Keep if image verified intact. |
| Qe8e4755e | "Histological grade best correlates with prognosis in which malignancy?" PYQ-tagged (UPSC-CMS 2023), Blooms 4, tests genuine comparative reasoning. |
| Q2d4fcd43 | MGUS vignette (80-year-old, IgG 1.5 g/dL, 8% plasma cells). Blooms 3, requires application of diagnostic criteria. |
| Q9584c6e1 | Post-surgical C. difficile colitis vignette with antibiotic exposure and colonoscopic finding. Blooms 3, plausible distractors. |
Fix
| Question ID | Issue | Fix Action |
|---|---|---|
| Qbb5c26e6 | p-ANCA keyed to Wegener's — factual inversion | Correct answer to microscopic polyangiitis; revise distractors |
| Q143a73c3 | "Fibrolysis" as wound healing summation — factually wrong | Complete rewrite with accurate answer options |
| Q87402786 | "Lung" keyed as least malignant carcinoid site — factual error | Change correct answer to "Appendix"; revise distractors |
| Q8c27b657 | Fibrinoid necrosis keyed to sarcoidosis — factual error | Correct answer to PAN or SLE; revise distractors |
| Q7bc57e72 | Fibrolamellar HCC described as "immature variant" — factually wrong | Rewrite stem: "Which variant of HCC occurs in young adults without cirrhosis?" |
| Q68986139 | Anti-RNP keyed as most common antibody in Sjögren's — factual error | Change correct answer to Anti-SSA (Ro); revise distractor list |
| Q2972ddbe | "All of the above" as correct answer | Replace with specific distractor; optionally convert to vignette |
| Q2b794432 | "All of the above" as correct answer for S100 | Restructure as clinical vignette with single best answer |
| Qdae93801 | "All of the above are true" in a NOT-true stem | Remove logically incompatible option; add plausible distractor; add brief vignette |
| Qb13605b4 | "Small cell" and "oat cell" as separate options | Replace one synonym with a genuinely distinct distractor; add clinical vignette |
| Q6f27784f | Macrophage incorrectly keyed as not being an APC | Reframe to ask specifically about epidermal APC; macrophage becomes a plausible but incorrect distractor in that context |
| Q4bbb1481 | "Increased LAP score" marked false for PV — factual error in key | Correct answer key; verify all other options against standard references |
| Q71bca3a5 | Pseudomyxoma peritonei attributed to ovarian mucinous cystadenocarcinoma — outdated | Update correct answer to low-grade appendiceal mucinous neoplasm (LAMN) |
| Qa125f518 | Pulmonary embolism as bare recall | Embed in clinical vignette (post-surgical patient with sudden cardiovascular collapse) to reach Blooms 3 |
| Q3019d3b9 | Smoking-related ILD vignette — good structure but distractors could be stronger | Strengthen distractors; keep clinical framing |
| Q43407734 | SCLC paraneoplastic "either/or" framing creates ambiguity | Split into two focused questions or specify one paraneoplastic syndrome per stem |
Disable
| Question ID | Rationale |
|---|---|
| Q43964cfa | "Most common cause of cell injury is hypoxia." Bare Blooms 1, no vignette, below quality floor. |
| Qb8d5faff | "Cytosolic Cytochrome C mediates?" Single-fact recall, no vignette. Concept covered better elsewhere. |
| Q86faba36 | Stem asks for cell type; correct answer is a cytokine (IFN-γ). Internally incoherent, unanswerable. |
| Q5d9dec55 | "Shock lung = ARDS." Glossary-level Blooms 1, no clinical value. |
| Qeb62e001 | "OI = Collagen type I defect." Blooms 1, no vignette. |
| Q3e7b798f | "BCR-ABL → CML." Blooms 1, already covered by PYQs at higher quality. |
| Q17333564 | "Rokitansky-Aschoff sinuses → chronic cholecystitis." Blooms 1, single association. |
| Qdd8bead8 | "Kuru plaques → CJD." Blooms 1; benchmark already has a superior CJD question (58d7d7fe). |
| Q91c8fc0b | Most common lymphoma in HIV — factually contested plus Blooms 1. |
| Q23994227 | Blood smear pH 6.8. Numerical trivia, no clinical value, not in any PYQ. |
| Q136d9263 | Dermoid cyst bilaterality ~10%. Numerical trivia, source-dependent, no clinical application. |
| Qae68faec | Chlamydia-induced apoptotic damage to sperm. Obscure factoid, poor distractor quality, not in PG syllabus. |
| Q501022ac | Calcitriol mechanism. Biochemistry/Physiology question; does not belong in Pathology bank. |
| Q5a1bfbd1 | "Overgrowth of bile duct in a localized region." Bare terminology recall, Blooms 1. |
| Q418c69d5 | "Birbeck granules seen in?" Blooms 1, flagged easy, no vignette. Concept worth testing only in vignette format. |
| Qffea18a1 | "Schaumann bodies seen in?" Same pattern as Q418c69d5 above. |
| Q88a55fb9 | "Adult PCKD inheritance?" Blooms 1, flagged easy. |
| Q76a98105 | "Hereditary spherocytosis transmitted as?" Blooms 1, flagged easy. |
| Q3a22b456 | "PSA = tumour marker." Single sentence, trivially obvious, no discriminatory value. |
| Q74669bc9 | "Waldenström = lymphoplasmacytic lymphoma." Bare association, Blooms 1. |
| Q65966351 | "Bloom-Richardson includes mitotic rate." Immediately obvious, no discriminatory value. |
| Qecb26966 | Compound odontoma. Dental pathology niche item, marginal syllabus relevance for INI-CET/NEET-PG. |
| Qba046d6d | "Most common trisomy — trisomy 16." Blooms 1, no clinical context. |
| Q34d9ed96 | "Role of thick mucous coat in GI tract." Vague stem, obvious answer, no discriminatory value. |
| Q39c6fa79 | "Least ability to regenerate → striated muscle." Conflates skeletal and cardiac muscle; factually imprecise. |