Final Production Synthesis
Executive Summary
This synthesis covers 21 subject reports drawn from a combined bank of 169,690 questions, with quality judgments grounded in a candidate sample of 4,000 randomly selected, validated non-gold questions. One subject ("Other") returned an empty sample and contributes no findings, so the analysis below rests on the remaining 20 reviewed subjects.
The overarching finding is structural, not incidental: the majority of the bank is misaligned with what INICET and NEET-PG actually test. Across nearly every subject, the candidate pool is dominated by single-fact recall items at Bloom's 1–2, while the benchmark and PYQ gold standards operate at Bloom's 3–4 with clinical vignettes, multi-step reasoning, and competitive distractors. This is not a handful of bad questions — it is the default mode of the bank.
Layered on top of this structural gap are five additional problem types that appear repeatedly across subjects: factually incorrect answer keys, broken image-dependent questions, structural MCQ construction failures, topic misclassification, and near-duplicate content clustering. Each requires a different remediation path and a different urgency level.
Rough disposition estimate across the 4,000-question candidate sample:
| Disposition | Estimated Count |
|---|---|
| Keep as-is or minor polish | ~900–1,100 |
| Fix (vignette upgrade, key correction, structural repair, reclassification) | ~1,200–1,500 |
| Disable | ~1,400–1,800 |
The disable rate is high but justified. The bank is large enough that removing low-quality items will not create coverage gaps; the risk of leaving them in is greater than the risk of removing them.
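As a quick check on the table, the estimated ranges can be expressed as shares of the 4,000-question sample. The sketch below uses only the figures from the table; the variable and function names are illustrative.

```python
SAMPLE_SIZE = 4000

# (low, high) estimated counts copied from the disposition table
dispositions = {
    "keep": (900, 1100),
    "fix": (1200, 1500),
    "disable": (1400, 1800),
}

def share(count, total=SAMPLE_SIZE):
    """Return count as a percentage of the sample, rounded to one decimal."""
    return round(100 * count / total, 1)

shares = {name: (share(lo), share(hi)) for name, (lo, hi) in dispositions.items()}
total_low = sum(lo for lo, _ in dispositions.values())
total_high = sum(hi for _, hi in dispositions.values())

for name, (lo, hi) in shares.items():
    print(f"{name}: {lo}%-{hi}%")
print(f"sum of ranges: {total_low}-{total_high} (sample = {SAMPLE_SIZE})")
```

The disable range works out to 35–45% of the sample. Because the three ranges are rough estimates, their endpoints sum to 3,500–4,400 and bracket the sample size rather than adding up to exactly 4,000.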
Broad Issue Categories Across The Bank
The subject reports independently converge on six distinct issue categories. These are not imposed from outside — they emerge from the evidence across subjects. Each has a different cause, a different harm profile, and a different remediation path.
Category A: Recall Overload — The Dominant Structural Problem
What it is. The single most pervasive finding across all 20 reviewed subjects is an extreme overrepresentation of Bloom's 1–2 recall items: single-fact lookups, bare definitions, eponym-to-condition mappings, isolated numerical thresholds, and drug-of-choice questions with no clinical context. These items test whether a candidate has memorized a fact, not whether they can use it.
Scale. Bloom's 1 proportions in the candidate samples range from roughly 24% (Obstetrics & Gynecology) to 56% (Anatomy). When Bloom's 2 items that function cognitively as Bloom's 1 are included (a pattern explicitly flagged in Anatomy, Anesthesiology, Biochemistry, Community Medicine, ENT, Forensic Medicine, Internal Medicine, Microbiology, Pharmacology, Physiology, Psychiatry, and Surgery), the effective recall proportion in most subjects is roughly 70–80% of the candidate pool or more. The benchmark and PYQ gold standards are predominantly Bloom's 3–4.
Where it is most visible. Anatomy (113 of 200 candidate questions at Bloom's 1), Physiology (93 of 200), Forensic Medicine (93 of 200), Microbiology (89 of 200), and Pharmacology (80 of 200) show the most severe imbalance. The problem is present in every subject but is least severe in General Medicine and Internal Medicine, where the Bloom's distribution is somewhat better, though both subjects have their own structural problems (missing images in General Medicine; thin vignettes in Internal Medicine).
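For reference, the Bloom's 1 counts quoted above convert to percentages as follows. This is plain arithmetic on the reported counts, assuming the 200-question per-subject sample stated in the text.

```python
SAMPLE_PER_SUBJECT = 200

# Bloom's 1 counts as reported for the five most affected subjects
bloom1_counts = {
    "Anatomy": 113,
    "Physiology": 93,
    "Forensic Medicine": 93,
    "Microbiology": 89,
    "Pharmacology": 80,
}

bloom1_share = {
    subject: round(100 * count / SAMPLE_PER_SUBJECT, 1)
    for subject, count in bloom1_counts.items()
}

for subject, pct in sorted(bloom1_share.items(), key=lambda kv: -kv[1]):
    print(f"{subject}: {pct}%")
```

These shares cover Bloom's 1 labels only. The higher effective recall proportion cited above also counts Bloom's 2 items that behave as recall, which these figures do not capture.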
The "thin vignette" variant. Multiple reports (Anatomy, Anesthesiology, Community Medicine, Dermatology, ENT, Internal Medicine, Obstetrics & Gynecology, Ophthalmology, Pediatrics, Psychiatry, Surgery) identify a secondary form: questions that have a clinical scenario in the stem but where the scenario is decorative. The clinical details do not change which answer is correct; the answer is immediately deducible from a single keyword. These items carry inflated Bloom's labels (often Bloom's 3 or 4) and pass surface-level quality checks, but they function as recall items. This is operationally important because they corrupt Bloom's distribution metrics and mislead test assembly.
Remediation split. Not all recall items should be treated the same way. Items testing genuinely high-yield, PYQ-validated facts (e.g., specific drug mechanisms, key diagnostic criteria) are worth converting to clinical vignettes. Items testing low-yield trivia, obsolete eponyms, historical attributions, or facts deducible from the question stem itself should be disabled without replacement. The reports consistently recommend: fix high-yield concepts, disable low-yield trivia.
Category B: Factually Incorrect or Unsafe Answer Keys
What it is. A recurring cluster of questions across subjects carries demonstrably wrong correct answers. These range from outright factual inversions (the keyed answer states the opposite of established teaching) to outdated management answers (the keyed answer was once correct but has been superseded by current guidelines) to internally contradictory option sets (where a distractor is also factually correct, creating two defensible answers).
Why this is the highest-severity category. Unlike low-quality recall items, which merely fail to discriminate, wrong-keyed items actively harm candidates by reinforcing incorrect knowledge. In clinical subjects, this has patient safety implications at one remove. This category requires expert subject-matter review before any deployment — it cannot be resolved by editorial staff alone.
Where it appears. Confirmed or strongly suspected answer key errors were identified in every subject where clinical content is tested. The highest densities were observed in:
- Anatomy: nerve root/dermatome errors (Q-84d81034, e4c5bed1), vascular anatomy errors (9616532b), abdominal anatomy inversions (19efdd36, 0ff7d348)
- Anesthesiology: bupivacaine toxicity treatment (da0f6ddd — 5% dextrose instead of 20% lipid emulsion), ventilator mechanics (0c76505f)
- Biochemistry: glycolysis ATP yield (f073adfe — keyed as 7 instead of 2), iron absorption (762bd09e — ascorbic acid marked as decreasing absorption)
- Community Medicine: HDI components (71053b22 — life expectancy marked as NOT included), OPV classification (e70433b8 — marked as not live attenuated)
- Forensic Medicine: ethylene glycol antidote (0ebdb4c9 — fluconazole instead of fomepizole), brain death criteria (0fdbebd6)
- Internal Medicine: stroke thrombolysis window (981af093 — 3 hours instead of 4.5 hours), acoustic neuroma nerve (51534038 — CN VII instead of CN VIII)
- Microbiology: HAV family (dac5b285 — Hepadnaviridae instead of Picornaviridae), HIV RNA nature (4807126e)
- Obstetrics & Gynecology: obesity complications (38dcb650 — infections marked as exception), BV criteria (fd03acaf — pH >4.5 marked as NOT a criterion)
- Ophthalmology: primary refractive element (Q55cfa205 — vitreous instead of cornea), POAG presentation (Qceb77783 — painful instead of painless)
- Pathology: ANCA specificity (Qbb5c26e6 — p-ANCA keyed to Wegener's instead of MPA), fibrinoid necrosis (Q8c27b657 — keyed to sarcoidosis)
- Pharmacology: vigabatrin mechanism (d2d6c7a8 — GABA agonism instead of GABA-T inhibition), esmolol in LV decompensation (e6523f6f)
- Psychiatry: catatonia classification (2abb2fc6: still described as a schizophrenia subtype, a classification that DSM-5 dropped), Capgras vs. Fregoli (9f5c4d02)
- Radiology: Kerley B lines (187a0076 — keyed to mitral regurgitation instead of stenosis), oxygen timing in radiotherapy (f5f25b69)
- Surgery: 3-glass urine test (06d38563 — prostatitis instead of urethritis), parapharyngeal mass (a9afb7cc)
A specific subtype: legal/guideline obsolescence. Forensic Medicine has a systemic version of this problem: IPC section numbers have not been updated to the Bharatiya Nyaya Sanhita (BNS), which replaced the IPC in July 2024. At least 15–20 questions in the Forensic Medicine sample cite IPC sections as current law. This is a subject-wide sweep task, not an item-by-item fix.
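Because this is a sweep rather than an item-level fix, a first pass could be a simple text search for IPC citations. The sketch below is a minimal illustration: the record layout (`id`, `stem`, `options`, `explanation`), the demo records, and the regular expression are all assumptions rather than the bank's actual schema, and the pattern will miss phrasings it does not anticipate. It only flags items for a reviewer; mapping each IPC section to its BNS equivalent still requires a legal reference.

```python
import re

# Assumed phrasings: "IPC", "I.P.C." or "Indian Penal Code" as a standalone term.
IPC_PATTERN = re.compile(r"\b(?:I\.?P\.?C\.?|Indian Penal Code)(?![A-Za-z])")

def flag_ipc_items(questions):
    """Return ids of questions whose stem, options or explanation cite the IPC."""
    flagged = []
    for q in questions:
        text = " ".join([q.get("stem", ""), *q.get("options", []), q.get("explanation", "")])
        if IPC_PATTERN.search(text):
            flagged.append(q["id"])
    return flagged

# Hypothetical demonstration records, not items from the bank.
sample = [
    {"id": "demo-1", "stem": "Punishment for murder is given under Section 302 IPC.", "options": []},
    {"id": "demo-2", "stem": "Section 103 of the BNS deals with which offence?", "options": []},
    {"id": "demo-3", "stem": "Grievous hurt is defined in:", "options": ["Section 320 I.P.C.", "Section 44 I.P.C."]},
]

print(flag_ipc_items(sample))
```

Searching options and explanations as well as stems matters here, since a section number often appears only in the answer choices.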
Category C: Broken Image-Dependent Questions
What it is. Questions that reference a visual element — an X-ray, ECG, clinical photograph, histopathology slide, instrument, or diagram — but do not have the image attached. These questions are functionally unanswerable as text items. They are not low-quality questions; they are non-functional questions.
Where it appears. This pattern was confirmed in every subject that uses image-based questions: Anatomy, Anesthesiology, Community Medicine, Dermatology, ENT, Forensic Medicine, General Medicine, Internal Medicine, Microbiology, Obstetrics & Gynecology, Ophthalmology, Orthopaedics, Pathology, Pediatrics, Pharmacology, Physiology, Radiology, and Surgery. General Medicine is the most severely affected subject in the sample, with an estimated 40–50% of its 200-question sample either fully image-dependent without an image or containing blank option sets. Radiology has the highest structural risk given that image interpretation is the core competency of the subject.
The PYQ-tagged variant is particularly important. Several reports (Anesthesiology, ENT, Forensic Medicine, Microbiology, Radiology) identify broken image-dependent questions that carry PYQ tags — meaning they were originally valid exam questions that lost their images during content migration. These are worth prioritizing for image recovery because the question structure is otherwise sound and the exam provenance is confirmed.
Remediation. Two paths exist: (1) locate and reattach the original image, verify rendering, and re-review the full item; (2) if the image cannot be recovered, rewrite the stem as a text-based clinical description that does not require a visual. Do not deploy image-dependent questions in any form until one of these paths is completed.
Category D: Structural MCQ Construction Failures
What it is. A set of item-writing defects that undermine question validity regardless of whether the underlying content is correct. The most common variants, observed repeatedly across subjects, are:
- "All of the above" as the keyed correct answer — allows test-wise candidates to select the answer by recognizing any one correct option, without evaluating the others. Observed in Anatomy, Anesthesiology, Biochemistry, Community Medicine, Dermatology, ENT, Internal Medicine, Microbiology, Obstetrics & Gynecology, Ophthalmology, Orthopaedics, Pathology, Pediatrics, Pharmacology, Physiology, Psychiatry, Radiology, and Surgery.
- "None of the above" as the keyed correct answer — provides no positive educational content; the candidate learns only that the listed options are wrong, not what the correct answer is. Observed in Community Medicine, ENT, Obstetrics & Gynecology, Pathology, Pediatrics, Psychiatry, and Radiology.
- Duplicate options within a single question — two options with identical text, making the question unanswerable. Observed in Anatomy, Microbiology, Ophthalmology, Pathology, Physiology, Psychiatry, and Surgery.
- Compound answer options — "Options A and C are correct" as a keyed answer. Observed in Anatomy and Dermatology.
- Tautological stems — the correct answer is contained within or immediately inferable from the question stem itself. Observed in Anatomy, ENT, Ophthalmology, Radiology, and Surgery.
- Implausible distractors: wrong options that are obviously wrong to any prepared candidate, which shrinks the effective choice set and makes the item easier than its nominal option count suggests. Observed broadly across subjects.
- Exam source citations embedded in stems — "Recent NEET Pattern 2016-17" appearing as in-text attribution in the stem. Observed in Anatomy (Q-3e9fd2ac) and ENT (f02712f7 — a corrupt import artefact where options are answer-key line numbers).
The "All of the above" problem is bank-wide. This format appears as the keyed correct answer in an estimated 5–10% of questions across the reviewed subjects. It is explicitly prohibited in standard item-writing guidelines for high-stakes medical examinations and should be retired as a format across the entire bank.
Category E: Topic Misclassification and Subject Boundary Contamination
What it is. Two related but operationally distinct problems: (1) questions filed under the wrong topic within a subject, distorting topic-level analytics and causing questions to appear in wrong practice contexts; (2) questions from entirely different subjects or specialties appearing in the wrong subject pool.
Topic misclassification within subjects was observed in Anatomy (Neuroanatomy tag used as a catch-all for non-anatomy content), Biochemistry (coagulation questions under Hemoglobin and Iron Metabolism), Community Medicine (communicable disease questions under Non-Communicable Diseases), Dermatology (lichen planus under Psoriasis, pityriasis rosea under Fungal Infections), ENT (cochlear implant questions under Diseases of the Nose), Microbiology (syphilis serology under Immunology, Rickettsia under Bacteriology), Obstetrics & Gynecology (Turner syndrome under Endocrinology of Pregnancy), Pathology (salivary gland tumors under Endocrine Pathology), Pharmacology (drug interactions under Pharmacokinetics), Physiology (autonomic questions under Acid-Base Balance), and Surgery (refeeding syndrome under Trauma).
Subject boundary contamination — questions from entirely different specialties — was observed in:
- Anatomy: Pathology, Physiology, Pharmacology, Immunology, Dentistry, Surgery, and Community Medicine questions under Neuroanatomy
- Anesthesiology: dental/TMJ content (621d2586, c5d01b8c)
- Biochemistry: physics questions from school-level databases (85e1211d, 2938f797)
- Community Medicine: clinical neurology (a66b90e1 — Wernicke's encephalopathy), ophthalmology (2f42243b)
- ENT: dental/oral surgery (303bd178 — radicular cyst)
- Microbiology: pre-medical biology (c03edf76 — prokaryotes lack mitochondria, appearing in GNM/ANM templates)
- Ophthalmology: orthopaedics (Qde280ade — Jersey finger, mallet finger, boxer knuckle filed under Diseases of the Cornea)
- Orthopaedics: oral and maxillofacial surgery (eb440c5e, 3e3137fc, dce4636e, b4f62ce2)
- Pharmacology: dental pharmacology (d0c5a4f4 — antisialogogue in orthodontic bonding)
- Physiology: blood banking (4fd152c3), clinical surgery (48637953 — Zollinger-Ellison syndrome triad)
- Radiology: dental/cephalometric radiology (e383863b, e1edfa65, 9460a0b4, cc5a8852, a757b7ab), ECG interpretation (2397438f)
- Surgery: dental/prosthodontics (8d6c0a57, 371df27e), general knowledge trivia (4ac2d0b8 — Smile Train charity)
The Anatomy Neuroanatomy tag and the Surgery "General Surgery Principles" topic label are the two most clearly identified catch-all buckets in the bank. Both warrant systematic audits across their full subject pools.
Category F: Near-Duplicate and Over-Concentrated Topic Coverage
What it is. Multiple questions testing the same narrow fact at the same cognitive level, with only superficial variation in the stem. This inflates apparent coverage breadth, wastes question slots, and — when duplicates appear in the same test — distorts difficulty calibration.
Where it is most visible. Near-duplicate clustering was identified in:
- Anatomy: multiple questions on the same nerve injury syndromes and embryological derivatives
- Anesthesiology: ketamine properties (at least 4–5 questions converging on the same 2–3 facts), thiopentone induction (748c298e and 44fe3a7d are near-identical), malignant hyperthermia (3 questions in a single 25-question shard)
- Community Medicine: epidemiological time intervals (latent period and incubation period as adjacent Bloom's 1–2 items), vaccine study design
- Dermatology: herpes zoster dermatomal vignettes (at least 3 near-identical items), slapped-cheek/Parvovirus B19 (2376d4bb is a direct duplicate of PYQ f2f1596e), molluscum contagiosum (f4bea92c and 750e55d4 in the same shard)
- ENT: cholesteatoma (c4493f51 and 579d9e08), laryngomalacia (aa961777 and 6fae6224), malignant otitis externa (3 questions on the same diabetic-Pseudomonas cluster in one shard)
- Internal Medicine: multiple "All of the above" items on the same clinical topics
- Microbiology: sterilization/disinfection (6 questions in the sample, several testing the same glutaraldehyde fact), Neisseria maltose fermentation (appears as the key discriminating fact in at least 4 separate questions)
- Obstetrics & Gynecology: bacterial vaginosis criteria (fd03acaf and 50f3d261 are near-identical EXCEPT questions), placenta previa management (1fa39140 and fb6ae1db)
- Orthopaedics: chondroblastoma location (3 questions — bfa032fb, 3daf3c95, 020352db — all resolving to the same answer)
- Pediatrics: Kawasaki disease (3 candidate pool items at Bloom's 1–2 alongside 4 benchmark items at Bloom's 3–4), pyloric stenosis, rotavirus, infant of diabetic mother
- Psychiatry: Korsakoff's syndrome (8f318862 is redundant with PYQs 19fd06f6 and 51d0d375)
The standard remediation is: within each confirmed duplicate cluster, retain the highest-quality item (highest Bloom's level, best clinical framing, PYQ-tagged if available) and disable the weaker versions.
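The retention rule can be written as a sort key. This is a sketch under assumed field names (`bloom`, `has_vignette`, `is_pyq`). The reports do not say how the three criteria should be weighted against each other, so the ordering used here (Bloom's level first, then clinical framing, then PYQ tag) is an assumption for the content team to adjust.

```python
def pick_retained(cluster):
    """Return (keep, disable) for one confirmed near-duplicate cluster."""
    def quality(item):
        # Higher tuple sorts first: Bloom's level, then vignette, then PYQ tag.
        return (item["bloom"], item["has_vignette"], item["is_pyq"])

    ranked = sorted(cluster, key=quality, reverse=True)
    return ranked[0], ranked[1:]

# Hypothetical cluster; ids and attributes are illustrative only.
cluster = [
    {"id": "demo-a", "bloom": 1, "has_vignette": False, "is_pyq": False},
    {"id": "demo-b", "bloom": 3, "has_vignette": True, "is_pyq": False},
    {"id": "demo-c", "bloom": 3, "has_vignette": True, "is_pyq": True},
]

keep, disable = pick_retained(cluster)
print(keep["id"], [item["id"] for item in disable])
```

The function assumes the cluster has already been confirmed as a duplicate set by a reviewer; it does not detect duplicates itself.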
Subject-Specific Hotspots
The six broad categories above appear across the bank, but several subjects have additional problems that do not generalize and require subject-specific attention.
Forensic Medicine has a systemic legal currency failure: IPC section numbers have not been updated to the Bharatiya Nyaya Sanhita (BNS, effective July 2024). This affects an estimated 15–20 questions in the 200-question sample and likely hundreds in the full 5,504-question pool. This is a subject-wide sweep task requiring a legislative update pass, not item-by-item fixes.
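The "likely hundreds" projection for Forensic Medicine follows from scaling the sample rate to the full pool. The arithmetic below assumes the 200-question sample is representative of the 5,504-question pool, which the reports do not guarantee.

```python
SAMPLE_SIZE = 200
POOL_SIZE = 5504
affected_in_sample = (15, 20)  # estimated IPC-citing questions in the sample

rate = tuple(100 * n / SAMPLE_SIZE for n in affected_in_sample)
projected = tuple(round(POOL_SIZE * n / SAMPLE_SIZE) for n in affected_in_sample)

print(f"sample rate: {rate[0]}%-{rate[1]}%")
print(f"projected affected items in pool: {projected[0]}-{projected[1]}")
```

Under that assumption the sample rate of 7.5–10% corresponds to roughly 410–550 questions in the full pool.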
General Medicine has a structural integrity crisis: an estimated 40–50% of its 200-question sample is non-functional due to missing images, blank option sets, or orphaned stems with no clinical content. This is the most severe image-integrity failure in the bank and must be resolved before any General Medicine questions are deployed.
Radiology has a high density of physics-definition and equipment-trivia questions (atomic weight, gyromagnetic properties, X-ray tube filament dimensions, pin index codes) that are pre-medical in cognitive level and have no place in a PG entrance bank. It also has the highest structural risk from broken image dependencies, given that image interpretation is the core competency of the subject.
Anatomy has a specific contamination pattern: the Neuroanatomy topic tag is being used as a catch-all for questions from Pathology, Physiology, Pharmacology, Immunology, Biochemistry, Surgery, Community Medicine, and Dentistry. A one-time audit of this tag across the 13,876-question pool is the single highest-leverage cleanup task in the subject.
Obstetrics & Gynecology has a FIGO staging version mismatch problem: questions on cervical, endometrial, and ovarian cancer staging use pre-2018 FIGO criteria that have been superseded. Treatment questions that omit substage specification (e.g., "Stage I cervical carcinoma" without specifying IA1 vs. IB2) have no single defensible correct answer. This requires a systematic oncology staging audit against current FIGO versions.
Psychiatry has a classification obsolescence problem: multiple questions still describe catatonia as a subtype of schizophrenia (superseded by DSM-5), test DSM-IV-TR multiaxial system details, or use terminology not recognized in DSM-5 or ICD-11. This is distinct from the general recall overload problem — these items are not merely low-quality, they are actively teaching the wrong classification framework.
Pharmacology has a content currency risk concentrated in antimicrobials (TB regimens referencing superseded RNTCP Category I/II/III), withdrawn drugs (propoxyphene presented as a current treatment option), and "recently approved" language applied to drugs approved years ago. These items are not merely outdated — they may reinforce dangerous clinical practice.
What Should Be Disabled First
The following represent the clearest, most urgent disable calls — items that should be removed from all active test templates before the next deployment cycle, without waiting for a full remediation sprint.
1. All questions with confirmed factually incorrect answer keys. These are actively harmful. Priority cases identified across subjects include: da0f6ddd (Anesthesiology — bupivacaine toxicity), f073adfe (Biochemistry — glycolysis ATP yield), 71053b22 (Community Medicine — HDI components), 0ebdb4c9 (Forensic Medicine — ethylene glycol antidote), 0fdbebd6 (Forensic Medicine — brain death criteria), 51534038 (Internal Medicine — acoustic neuroma nerve), dac5b285 (Microbiology — HAV family), 4807126e (Microbiology — HIV RNA), 38dcb650 (OB/GYN — obesity complications), Qbb5c26e6 (Pathology — ANCA specificity), Q55cfa205 (Ophthalmology — refractive element), 187a0076 (Radiology — Kerley B lines), 06d38563 (Surgery — 3-glass urine test), and the full list of confirmed errors in each subject report.
2. All broken image-dependent questions. Questions where the stem references a visual element that is not present are non-functional. Disable immediately for test assembly. Restoration (image recovery or text rewrite) can proceed in parallel but should not delay the disable action. General Medicine, Radiology, and ENT have the highest confirmed counts.
3. All "All of the above" keyed-correct items. This format is a known psychometric failure and should be retired bank-wide. These items can be identified by pattern matching and disabled in a single batch operation without requiring content expertise.
4. Confirmed out-of-scope contamination. Questions from entirely different specialties (dental/prosthodontics content in Surgery, orthopaedics in Ophthalmology, ECG interpretation in Radiology, school-level physics in Biochemistry, Smile Train charity trivia in Surgery) should be disabled from their current subject pools immediately. These require no editorial judgment because the subject mismatch is unambiguous.
5. Corrupt or structurally non-functional items. Items with duplicate options (making the question unanswerable), blank option sets, orphaned stems with no clinical content, or import artefacts where options are answer-key line numbers (f02712f7 in ENT) should be disabled immediately.
What Should Be Fixed Instead Of Disabled
The following categories represent the highest-return fix investments — items where the underlying concept is sound and the remediation path is clear.
Vignette upgrades for high-yield recall items. Across subjects, the most valuable fix is converting a bare-recall item on a genuinely high-yield concept into a clinical vignette that requires the candidate to apply the fact rather than retrieve it. The benchmark questions in each subject provide the template. The fix rule is consistent: add a brief clinical scenario (2–3 sentences: patient demographics, presentation, relevant finding), ensure the clinical details are load-bearing (i.e., they change which answer is correct), and verify that the distractors represent genuine clinical alternatives. Priority topics for vignette conversion: nerve injuries and brachial plexus (Anatomy), malignant hyperthermia and NMB reversal (Anesthesiology), inborn errors of metabolism (Biochemistry), applied epidemiology calculations (Community Medicine), leprosy and photodermatoses (Dermatology), red-flag presentations (ENT), asphyxial deaths and toxicology (Forensic Medicine), management of common conditions (Internal Medicine), organism-disease associations in clinical context (Microbiology), management of obstetric emergencies (OB/GYN), glaucoma and retinal vascular disease (Ophthalmology), bone tumors and trauma (Orthopaedics), lysosomal storage diseases and urea cycle disorders (Pathology), neonatal emergencies and developmental surveillance (Pediatrics), drug interactions and adverse effects (Pharmacology), applied physiology calculations (Physiology), and management of psychiatric emergencies (Psychiatry).
Answer key corrections for structurally sound questions. Where a question has a correct stem, appropriate clinical framing, and good distractors but a wrong answer key, the fix is targeted: correct the key, verify all distractors remain valid, and add a guideline citation to the explanation. This applies to the outdated management answers in Surgery (thyroid cancer, neck dissection), the classification obsolescence items in Psychiatry, the legal currency items in Forensic Medicine (IPC → BNS update), and the FIGO staging version mismatches in OB/GYN.
Structural repairs for "All of the above" and duplicate-option items. "All of the above" items where the underlying concept is high-yield should be restructured as single-best-answer questions: select the most discriminating correct statement as the answer and replace the others with plausible distractors. Duplicate-option items should have one duplicate replaced with a genuinely distinct alternative. These are mechanical fixes that do not require deep content expertise.
Topic reclassification for misclassified items. Questions that are correctly constructed but filed under the wrong topic or wrong subject require only a metadata correction — no content change. These should be batched as a tagging audit rather than handled item by item. The Anatomy Neuroanatomy tag, the Surgery "General Surgery Principles" bucket, and the Radiology dental/cephalometric cluster are the highest-priority targets.
Image restoration for PYQ-tagged broken items. Image-dependent questions that carry PYQ tags and are otherwise well-constructed should be prioritized for image recovery. If the original image can be located and verified, these items are worth restoring because their exam provenance is confirmed. If the image cannot be recovered, rewrite the stem as a text-based clinical description.
Recommended Content Team Workflow
Given the scale of the issues identified, a phased approach is recommended. The phases are ordered by urgency and by the degree of content expertise required.
Phase 1 — Immediate safety actions (before next deployment cycle, no editorial judgment required)
Run automated or semi-automated filters to identify and disable:
- All questions with "All of the above" or "None of the above" as the keyed correct answer
- All questions with image-reference language in the stem ("shown below," "as depicted," "the image shows") where no image is confirmed attached
- All questions with duplicate option text within a single item
- All questions with blank or near-blank option sets
- All questions with import artefacts (options that are answer-key line numbers, LaTeX placeholders in stems)
- All confirmed out-of-scope contamination items (dental questions in Surgery, physics questions in Biochemistry, ECG questions in Radiology)
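Several of the filters above are mechanical enough to sketch in a few lines. The example below is illustrative only: the record layout (`stem`, `options`, `answer_index`, `has_image`) is an assumed schema, the image-reference phrases are the three quoted in the list, and the import-artefact and out-of-scope checks are left out because they depend on details not given here.

```python
import re

AOTA_NOTA = re.compile(r"^\s*(all|none)\s+of\s+the\s+above\.?\s*$", re.IGNORECASE)
IMAGE_REF = re.compile(r"shown below|as depicted|the image shows", re.IGNORECASE)

def _norm(text):
    """Lowercase and collapse whitespace so trivially different options compare equal."""
    return " ".join(text.lower().split())

def phase1_flags(q):
    """Return the Phase 1 disable reasons that apply to one question record."""
    options = q.get("options", [])
    flags = []
    key = q.get("answer_index")
    if key is not None and 0 <= key < len(options) and AOTA_NOTA.match(options[key]):
        flags.append("aota_nota_keyed")
    if IMAGE_REF.search(q.get("stem", "")) and not q.get("has_image", False):
        flags.append("missing_image")
    filled = [_norm(o) for o in options if o.strip()]
    if len(filled) != len(set(filled)):
        flags.append("duplicate_options")
    if len(filled) < len(options) or len(options) < 2:
        flags.append("blank_options")
    return flags

# Hypothetical record used only to exercise the checks.
demo = {
    "stem": "The X-ray shown below is most consistent with:",
    "options": ["Option text A", "Option text B", "option text  b", "All of the above"],
    "answer_index": 3,
    "has_image": False,
}
print(phase1_flags(demo))
```

Only the keyed option is tested for "All of the above" or "None of the above", matching the rule as stated. A phrase list this short will miss other image references (for example "in the given figure"), so flagged counts should be treated as a lower bound.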
Separately, route all questions flagged with confirmed factual errors (from the subject reports) to a subject-matter expert disable queue. These require expert sign-off before disabling, but should be held out of active test templates immediately.
Phase 2 — Expert review sprint (2–4 weeks, requires subject-matter expertise)
For each subject, convene a brief expert review of the confirmed wrong-key items identified in the subject reports. The output of this review is: (a) confirmed disable, (b) corrected answer key with guideline citation, or (c) full rewrite. No item with a disputed answer key should be re-enabled without expert sign-off.
Simultaneously, run the Forensic Medicine IPC → BNS legislative update pass. This is a subject-wide sweep that can be executed by a content reviewer with legal reference access, without requiring a clinical expert.
Run the Anatomy Neuroanatomy tag audit. Any question under this tag that does not test a nervous system structure, pathway, or function should be flagged for reclassification or removal.
Phase 3 — Systematic quality improvement (rolling, 4–12 weeks)
Execute topic-level vignette conversion passes, starting with the subjects and topics where the Bloom's gap is most severe and the exam frequency is highest. Use the benchmark questions in each subject as the explicit template. The conversion rule: the clinical details in the stem must be load-bearing — removing them should change which answer is correct.
Execute the deduplication pass for confirmed near-duplicate clusters (Kawasaki disease in Pediatrics, chondroblastoma in Orthopaedics, sterilization/disinfection in Microbiology, ketamine in Anesthesiology, etc.). Within each cluster, retain the highest-quality item and disable the rest.
Execute the FIGO staging version audit for OB/GYN oncology questions and the Psychiatry classification currency audit (DSM-5/ICD-11 alignment).
Phase 4 — New content development (ongoing)
The bank cannot reach benchmark quality through remediation alone. Several subjects have genuine coverage gaps at Bloom's 3–4 that cannot be filled by upgrading existing items — the items simply do not exist. Priority development targets identified across the reports: applied epidemiology calculations (Community Medicine), reproductive physiology application questions (Physiology), exercise physiology and sensory physiology (Physiology), leprosy and photodermatoses clinical vignettes (Dermatology), gestational trophoblastic disease management (OB/GYN), and fetal surveillance scenarios (OB/GYN). New items should be commissioned using the benchmark questions as the explicit quality template, with a mandatory Bloom's 3 minimum and a clinical scenario that does real reasoning work.
Standing policies to implement bank-wide
The following rules, if applied consistently going forward, would prevent recurrence of the most common problems identified in this audit:
- "All of the above" and "None of the above" are not acceptable as keyed correct answers in any subject.
- No question with an image reference in the stem may be published without a confirmed, rendering-verified image attachment.
- No Bloom's 3 or higher label may be applied to a question without a clinical scenario in which the clinical details are load-bearing.
- No question citing a specific law, guideline version, or drug approval status may be published without a date-stamped reference to the source.
- Any question whose correct answer is a specific numerical threshold, staging criterion, or management protocol must include a guideline citation in the explanation field.
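The last three policies could be enforced as a pre-publication gate if reviewers record the relevant attestations as metadata. The sketch below is one possible shape, not an existing tool. Every field name (`bloom`, `scenario_load_bearing`, `cites_dated_source`, `source_date`, `answer_type`, `guideline_citation`) is hypothetical, and whether a scenario is load-bearing remains a human judgment that the code only reads back.

```python
# Answer types that the policy says must carry a guideline citation.
CITATION_REQUIRED = {"numerical_threshold", "staging_criterion", "management_protocol"}

def publish_blockers(q):
    """Return the standing-policy violations that should block publication."""
    blockers = []
    # Bloom's 3+ needs a reviewer attestation that the scenario is load-bearing.
    if q.get("bloom", 1) >= 3 and not q.get("scenario_load_bearing", False):
        blockers.append("bloom3_without_load_bearing_scenario")
    # Items citing a law, guideline version or approval status need a dated source.
    if q.get("cites_dated_source", False) and not q.get("source_date"):
        blockers.append("missing_date_stamped_reference")
    # Thresholds, staging criteria and protocols need a citation in the explanation.
    if q.get("answer_type") in CITATION_REQUIRED and not q.get("guideline_citation"):
        blockers.append("missing_guideline_citation")
    return blockers

# Hypothetical record: passes two checks, fails the dated-source check.
demo = {
    "bloom": 3,
    "scenario_load_bearing": True,
    "cites_dated_source": True,
    "source_date": "",
    "answer_type": "staging_criterion",
    "guideline_citation": "placeholder guideline reference",
}
print(publish_blockers(demo))
```

A gate like this checks only that the required metadata is present; it cannot confirm that the cited source is current or correct.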