Pediatrics Question Quality Review
Executive Summary
The 200-question candidate sample for Pediatrics reveals a subject pool with meaningful structural and content problems that are consistent across all eight reviewed shards. The core issue is not isolated bad questions but a set of recurring, operationally distinct failure modes that collectively depress the discriminatory value of the pool.
The Bloom's distribution of the candidate set (52 items at Level 1, 82 at Level 2, 49 at Level 3, 16 at Level 4, 1 at Level 5) already signals the problem: roughly 67% of sampled questions sit at Bloom's 1–2, while the benchmark gold standard operates almost entirely at Bloom's 3–4 with rich clinical vignettes. The gap is not marginal.
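The 67% figure can be reproduced from the per-level counts stated above. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of any existing tooling.

```python
# Bloom's level -> number of sampled questions, as reported in this section.
bloom_counts = {1: 52, 2: 82, 3: 49, 4: 16, 5: 1}

total = sum(bloom_counts.values())
low_level = bloom_counts[1] + bloom_counts[2]
low_share = low_level / total

# Prints: total=200, Bloom's 1-2=134 (67%)
print(f"total={total}, Bloom's 1-2={low_level} ({low_share:.0%})")
```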
Seven distinct issue categories emerge from the evidence. They are listed here in the order discussed below; relative urgency is set out under Prioritization:
- Broken questions: image-dependent items with no image. A hard disabling category requiring no editorial judgment
- Wrong or unsafe answer keys: a patient-safety-adjacent accuracy problem requiring expert clinical review before any deployment
- Bloom's 1 recall overload (bare facts and isolated thresholds): the largest single volume problem, concentrated in Neonatology, Growth & Development, and Nutrition
- Recall dressed as clinical reasoning (Bloom's inflation): vignette wrappers that do not raise cognitive demand, inflating apparent quality
- Structurally defective option sets: "All of the above," "None of these," and fabricated distractors that destroy discrimination
- Topic saturation and near-duplicate coverage: Kawasaki disease, pyloric stenosis, rotavirus, and IDM appear repeatedly at low cognitive levels without differentiation
- Factual currency and guideline concordance failures: answers that reflect older guidelines or context-dependent recommendations presented as universal facts
Across the 200-question sample, an estimated 25–30 questions should be disabled immediately (broken, factually wrong, or trivially below bar), approximately 60–70 require targeted fixes of varying complexity, and roughly 50–60 are serviceable as-is or with minor polish. The remainder are borderline and should be reviewed against topic coverage needs before a final disposition.
What Good Looks Like
The benchmark set and recent PYQs establish a clear quality bar. Every benchmark item shares the following properties:
Clinical vignette with sufficient discriminating detail. The stem provides age, duration of illness, key examination findings, and at least one investigation result. The clinical picture is specific enough that a candidate who does not understand the underlying pathophysiology cannot guess the answer from surface features. Compare the benchmark Kawasaki item (Q-4f17d182) — which specifies day 18 of illness, recurrent fever after IVIG, new coronary dilatation with a Z-score, and a CRP value — against the candidate pool's Q-82ae9600, which presents a textbook enumeration of Kawasaki features and asks for the diagnosis. The former requires management reasoning; the latter requires only recognition.
Distractors that represent genuine clinical alternatives. In the benchmark Langerhans cell histiocytosis question (Q-0e4aa6cb), all four options are real treatment modalities used in LCH at different disease stages. A candidate who does not know the staging criteria cannot eliminate any option on implausibility grounds alone. This is the standard the candidate pool should meet.
The correct answer is unambiguous and defensible against current guidelines. The benchmark neonatal sepsis question (Q-343e7bcf) specifies gram-positive cocci in chains, early-onset presentation, and asks for the antibiotic combination — the answer (ampicillin + gentamicin) is unambiguous and guideline-concordant. There is no competing defensible answer.
Bloom's level matches the cognitive demand. Benchmark items labeled Bloom's 3 require the candidate to apply knowledge to a novel clinical situation, not retrieve a memorized fact. The DMD sibling question (Q-dbe47e0e) is labeled Bloom's 3 and correctly so: the candidate must reason that an asymptomatic sibling of a DMD proband requires proactive investigation, not reassurance or prophylactic treatment.
Appropriate difficulty calibration. Even the "easy" benchmark items (flagged as such) use a vignette format and test a clinical decision, not a definition. The epiglottitis question (Q-e8b7fc20) is flagged easy but still requires the candidate to choose between four plausible-sounding management options, one of which (direct laryngoscopy) is actively dangerous.
The candidate pool should be evaluated against this bar, not against a lower internal standard.
Main Issue Categories
1. Broken Questions: Image-Dependent Items Submitted Without Images
Why this pattern is bad
These questions are not merely weak — they are non-functional. A candidate reading the text cannot answer the question because the discriminating information exists only in a missing image. Deploying these items in any test context produces random responses, not measurement. They also cannot be fixed editorially without the original image asset.
How it shows up
The pattern appears in at least eight confirmed instances across the reviewed shards. The trigger phrases are consistent: "the following instrument is used for," "the given ECG shows," "the MRI scan of his head is shown," "as shown below," "the following findings," and "the child can do this activity at which age." In every case the stem is either entirely empty of clinical content or contains only a thin wrapper that is insufficient to answer the question without the visual.
The affected question types cluster into four subtypes:
- Radiology/imaging questions (MRI, ECG, X-ray): Q-fa0dc3b6 (craniopharyngioma MRI), Q-ccc2445d (neonatal ECG in maternal SLE)
- Instrument/equipment identification questions: Q-7cec58f8 ("the following instrument is used for")
- Developmental milestone activity questions: Q-2557f818 ("the child can do this activity at which age")
- Clinical photograph questions: Q-ac1de03d (lip lesions "as shown below"), Q-716da2fa (obesity evaluation, "following finding is observed"), Q-9506cbfe (child abuse, "following findings")
Example question IDs with short explanations
- Q-fa0dc3b6: Stem reads "The MRI scan of his head is shown." No image present. The question is about craniopharyngioma but without the MRI the candidate has no basis for answering. Non-functional.
- Q-ccc2445d: "The given ECG of a neonate born to a mother with SLE shows?" No ECG attached. The clinical context (maternal SLE, neonatal heart block) is inferable but the specific ECG finding being tested cannot be evaluated from text alone.
- Q-7cec58f8: "The following instrument is used for:" — no image, no description of the instrument. Completely unanswerable.
- Q-2557f818: "The child can do this activity at which age?" — no image or description of the activity. The stem contains zero clinical information.
- Q-9506cbfe: "What is the probable cause for the following findings?" — no image, no clinical description. The correct answer (benign skin lesion) cannot be evaluated.
- Q-716da2fa: "During evaluation of a child with obesity, following finding is observed" — references a missing image that would show polydactyly or retinal dystrophy for Laurence-Moon-Bardet-Biedl syndrome.
Recommended disposition: Disable all
These items cannot be fixed editorially. They require the original image asset to be located, verified for accuracy, and properly embedded. Until that is confirmed, every item in this category must be disabled. Do not attempt to rewrite these as text-only questions without subject-matter expert involvement, as the image was the primary discriminating element.
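Because the trigger phrases are so consistent, the wider pool could be screened for this defect automatically. The sketch below is one possible filter, not an existing tool: the pattern list is drawn from the phrases quoted in this section, while the `has_image` field, the record layout, and the `Q-example` entry are assumptions made for illustration. Matches should be treated as candidates for human confirmation, since a phrase like "is shown" can also appear in stems that are complete.

```python
import re

# Trigger phrases taken from the stems quoted in this section. The list is
# illustrative, not exhaustive; extend it as new phrasings are found.
IMAGE_TRIGGERS = [
    r"\bthe following instrument\b",
    r"\bthe given ecg\b",
    r"\bis shown\b",
    r"\bas shown below\b",
    r"\bfollowing findings?\b",
    r"\bthis activity\b",
]
IMAGE_PATTERN = re.compile("|".join(IMAGE_TRIGGERS), re.IGNORECASE)


def needs_image(stem: str, has_image: bool) -> bool:
    """Return True when the stem refers to a visual but no image is attached."""
    return (not has_image) and IMAGE_PATTERN.search(stem) is not None


# Assumed record shape: (question_id, stem, has_image).
# "Q-example" is a made-up control item with a self-contained stem.
sample = [
    ("Q-7cec58f8", "The following instrument is used for:", False),
    ("Q-fa0dc3b6", "The MRI scan of his head is shown.", False),
    ("Q-example", "A 3-year-old presents with fever for 5 days.", False),
]
flagged = [qid for qid, stem, has_image in sample if needs_image(stem, has_image)]
print(flagged)  # ['Q-7cec58f8', 'Q-fa0dc3b6']
```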
2. Wrong or Clinically Unsafe Answer Keys
Why this pattern is bad
This is the highest-severity content problem in the sample. Questions with incorrect correct answers do not merely fail to measure knowledge — they actively teach wrong clinical information. In a subject like Pediatrics, where several questions touch on resuscitation protocols, emergency management, and diagnostic thresholds, a wrong key can reinforce dangerous practice patterns in candidates who are preparing for clinical work. This category requires expert clinical review before any item is deployed, regardless of other quality attributes.
How it shows up
The pattern appears across multiple topics and is not confined to a single shard or topic area. The errors fall into three subtypes:
Subtype A — Management guideline violations: The correct answer contradicts current standard-of-care guidelines.
- Q-4ea331d8: Marks "IV high-dose dexamethasone" as correct for Dengue Shock Syndrome. Current WHO and IAP guidelines recommend IV crystalloid (isotonic saline or Ringer's lactate) as first-line; steroids are not recommended and may be harmful. This is a direct guideline violation.
- Q-11be839e: Marks "trickling of the sole" (tactile stimulation) as the correct next step after three failed resuscitation attempts in a meconium aspiration scenario. Per NRP guidelines, after failed stimulation the next step is positive-pressure ventilation. The correct answer is wrong and teaches a dangerous sequence.
- Q-0270de64: Marks "intravenous fluids" as the most appropriate initial management for intussusception in a hemodynamically stable 6-year-old. Pneumatic (air) enema reduction under fluoroscopic guidance is the standard first-line definitive treatment. IV fluids are part of resuscitation but are not the answer to "most appropriate initial management."
Subtype B — Pathophysiology/diagnosis inversions: The correct answer is factually wrong for the clinical scenario described.
- Q-85cbee3b: Presents a 9-month-old with Hb 3.8 g/dL, target cells, normoblasts, skull X-ray showing marrow expansion, and onset at 6 months — a textbook picture of β-thalassemia major — but marks "iron deficiency anemia" as correct. Iron deficiency does not cause skull X-ray changes or normoblasts at this severity in an infant.
- Q-51829a39: Asks which condition causes conjugated hyperbilirubinemia in infancy and marks "Gilbert disease" as correct. Gilbert disease causes unconjugated hyperbilirubinemia. Choledochal cyst and biliary atresia (listed as distractors) are the actual correct answers.
- Q-2e485771: Marks "flail mitral valve" as correct for an ASD patient with a mitral regurgitation murmur and left axis deviation. The expected answer for ostium primum ASD is the ASD subtype itself, not a valve pathology.
- Q-8f334db0: States the correct answer is "most common in diaphysis" for neonatal osteomyelitis. Standard pediatric references describe metaphyseal involvement as predominant; calling diaphysis "most common" is non-standard and likely incorrect.
- Q-2ff4f881: Marks "diaphragmatic hernia" as the cause of the double bubble sign. Double bubble sign is caused by duodenal atresia, annular pancreas, or Ladd's bands — not diaphragmatic hernia.
Subtype C — Inverted logic in EXCEPT/FALSE format: The marked answer is actually true, not false.
- Q-751e677a: The marked correct answer states "autosomal recessive" is false for Tangier disease — but Tangier disease IS autosomal recessive (ABCA1 mutations). The logic is inverted.
- Q-ac4552e2: Marks "None of these" as the answer for treatment of virilizing adrenal hyperplasia. The correct treatment (glucocorticoid replacement with hydrocortisone) is absent from the options entirely.
Example question IDs with short explanations
See above. Priority cases for immediate expert review: Q-4ea331d8, Q-11be839e, Q-85cbee3b, Q-51829a39, Q-751e677a.
Recommended disposition
- Q-4ea331d8, Q-11be839e, Q-0270de64, Q-85cbee3b, Q-51829a39, Q-2e485771, Q-8f334db0, Q-2ff4f881, Q-751e677a, Q-ac4552e2: Hold for mandatory expert clinical review; none should be deployed in its current state. The resolution path differs by item: some require a corrected answer key, some a rewritten vignette, some both, and some are better disabled and rebuilt, as noted below.
- Items where the vignette itself is sound and only the key needs correction (Q-85cbee3b, Q-51829a39) are higher-priority fixes because the stem investment is recoverable.
- Items where the vignette and key are both wrong (Q-2ff4f881) should be disabled and rebuilt from scratch.
3. Bloom's 1 Recall Overload: Bare Facts, Isolated Thresholds, and Eponym Lookups
Why this pattern is bad
This is the largest single volume problem in the sample. Pure recall questions — single-fact lookups with no clinical context — do not measure the clinical reasoning skills that INI-CET and NEET-PG are designed to assess. They also provide no discriminatory value at PG level because any candidate who has read a textbook once can answer them. When these items cluster in a topic area, they create the illusion of coverage while actually measuring nothing useful. The benchmark set demonstrates that even "easy" items should use a brief clinical scenario.
How it shows up
This pattern appears in every shard and is concentrated in four topic areas: Neonatology, Growth & Development, Pediatric Nutrition, and Infectious Diseases (immunization schedule). The items share a common structure: a one-line stem asking for a definition, a threshold value, an eponym, or a single association, with no patient age, no clinical context, and no reasoning required.
Subtypes observed in this sample:
Numeric threshold recall: Q-eb9cda41 (ELBW definition — <1000 g), Q-562076b0 (apnea definition — 20 seconds), Q-b4308e8a (moderate hypothermia temperature range), Q-d63a644e (neonatal hyperglycemia threshold), Q-75936207 (polycythemia hematocrit cutoff — also factually wrong for the pediatric context), Q-e0b6b0e0 (respiratory rate threshold for pneumonia in a 3-year-old).
Developmental milestone recall: Q-b88c5b7c (age for drawing a circle), Q-341369bb (3-year-old milestone), Q-e313935c (primary dentition age: 2.5 years).
Eponym and single-association recall: Q-f3054f5a (infantile cortical hyperostosis = Caffey disease), Q-a73eac1d (Koplik spots = measles), Q-c050f251 (Trisomy 13 = Patau syndrome), Q-22e7ded5 (most common type of cerebral palsy = spastic), Q-d9ab69a8 (most common presentation of Down syndrome = cognitive impairment), Q-1dd3e96c (HHV-6 = exanthema subitum), Q-9c045f45 (first-trimester ultrasound finding in Down syndrome).
Neonatal physiology trivia: Q-b5e0c769 (normal cardiac output of a newborn — 350 ml/kg/min), Q-7cbcd28f (glycogen stores in prematurity), Q-8bf99743 (25 cm length gain in first year), Q-2de13935 (most common immunoglobulin in colostrum = IgA).
Immunization and infectious disease recall: Q-ee888fdc (influenza vaccine age = 6 months), Q-8ce81380 (in-utero rubella = IgM), Q-67d74a85 (rotavirus as most common cause of diarrhea; bare factual).
Example question IDs with short explanations
- Q-eb9cda41: "ELBW is defined as birth weight less than ___." Four options with similar weight cutoffs. No clinical scenario, no reasoning required. A flashcard, not an MCQ.
- Q-b88c5b7c: "At what age can a child draw a circle?" Single developmental milestone, no developmental surveillance context, no competing clinical interpretation.
- Q-f3054f5a: "Infantile cortical hyperostosis is also known as ___." Pure eponym recall. A candidate who has never seen a patient with this condition can answer it from a mnemonic.
- Q-74cd57e1: "Best indicator for growth measurement = weight." Decontextualized and factually debatable (height/length is preferred for longitudinal growth assessment).
- Q-756c89e8: "Most common presentation of congenital hypothyroidism." Flagged easy and Bloom's 1. Asymptomatic at birth is correct but this is a bare recall item with no clinical utility.
Recommended disposition
The disposition depends on whether the underlying concept has clinical value at PG level:
- Items testing concepts that are genuinely high-yield but currently formatted as bare recall (apnea definition, ELBW, hypothermia grades, developmental milestones): Fix by converting to a brief clinical vignette. For example, Q-562076b0 (apnea definition) can be reframed as a preterm infant whose monitor alarm triggers — the candidate must decide whether the pause constitutes apnea requiring intervention. This preserves the content while adding clinical utility.
- Items testing concepts that are low-yield trivia at PG level (cardiac output of a newborn in ml/kg/min, glycogen stores in prematurity, primary dentition age): Disable. These facts do not appear in INI-CET or NEET-PG PYQs and have no clinical decision-making relevance.
- Items that are both low-yield and factually wrong (Q-75936207 — polycythemia hematocrit cutoff conflating adult and neonatal values): Disable immediately.
- Eponym recall items (Caffey disease, Koplik spots, Patau syndrome): Disable. If the concept must be tested, it should appear as a clinical vignette with the phenotype described, not as a name-to-name association.
The content team should treat Neonatology and Growth & Development as the highest-priority topic areas for a Bloom's upgrade pass.
4. Recall Dressed as Clinical Reasoning: Bloom's Inflation in Vignette-Wrapped Items
Why this pattern is bad
This category is distinct from Category 3 because the questions do have a clinical vignette — they are not bare one-liners. The problem is that the vignette provides no additional discriminating information beyond what a single memorized fact would supply. The clinical wrapper creates the appearance of Bloom's 3–4 reasoning while the actual cognitive demand is Bloom's 1–2. This matters operationally because these items pass surface-level quality checks but fail to measure clinical reasoning, and they inflate the apparent Bloom's distribution of the pool.
How it shows up
The pattern is most common in Cardiology, Neonatology, and Infectious Diseases. The diagnostic tell is that the vignette presents a textbook-classic constellation of findings with no ambiguity, no competing diagnosis, and no management decision required — the answer is the name of the condition or a single associated fact.
- Classic constellation → diagnosis: Q-82ae9600 (Kawasaki disease — fever, conjunctival injection, strawberry tongue, rash, lymphadenopathy → "what is the diagnosis?"). The diagnosis is immediately obvious from the stem. No reasoning is required. Labeled Bloom's 4.
- Classic presentation → single associated fact: Q-bea5b00f (coarctation of aorta — absent femoral pulses, upper limb hypertension → "what is the diagnosis?"). Labeled Bloom's 3. This is Bloom's 2 pattern recognition at best.
- Textbook symptom list → diagnosis: Q-fb8e91f7 (pheochromocytoma: hypertension, sweating, palpitations, headache → "what is the diagnosis?"). Labeled Bloom's 3. The symptom list is a direct textbook enumeration with no ambiguity.
- Single-sign vignette: Q-3323847b (aortic stenosis murmur location — "a child has a harsh systolic murmur at the right upper sternal border" → "what is the diagnosis?"). One clinical sign, one answer.
- IDM → hypoglycemia: Q-7f61dd15 (infant of diabetic mother, jitteriness, low glucose → "what is the complication?"). The answer is the only metabolic complication mentioned in the stem.
- Syndrome → chromosomal deletion: Q-fff0f8c6 (cri-du-chat: high-pitched cry, microcephaly → "what is the chromosomal abnormality?"). The syndrome is identifiable from the clinical description alone; the question asks only for the memorized deletion.
Example question IDs with short explanations
- Q-82ae9600: Five classic Kawasaki criteria listed in the stem, labeled Bloom's 4. The question asks for the diagnosis. This is Bloom's 2 at most. It also thematically overlaps with four Kawasaki items in the benchmark and PYQ sets.
- Q-bea5b00f: "Absent femoral pulses and upper limb hypertension in a child" — this is the textbook definition of coarctation. Labeled Bloom's 3. Should be Bloom's 2.
- Q-a35ede96: Edwards syndrome features (dolichocephaly, rocker-bottom feet, overlapping fingers) → "which cardiac lesion is associated?" This is a reasonable question but the Bloom's 3 label is generous; it is a memorized association.
- Q-addd0a83: "Most common intra-abdominal solid organ tumor in children" — labeled Bloom's 4. This is a single-fact recall item. The Bloom's 4 label is unjustified by any reasonable interpretation.
Recommended disposition
These items are not disabled on accuracy grounds — most are factually correct. The issue is cognitive level and exam relevance.
- Items where the concept is high-yield and the vignette is structurally sound but the cognitive demand is too low: Fix by adding a management or complication decision layer. For example, Q-82ae9600 (Kawasaki diagnosis) should be reframed as a management or IVIG-resistance question. Q-bea5b00f (coarctation) should be reframed to ask about the investigation of choice or the timing of intervention.
- Items where the concept is low-yield or the vignette is too thin to be worth rewriting: Disable. Q-addd0a83 (most common intra-abdominal tumor) is a pure recall fact that can be embedded in a richer oncology vignette if needed.
- All items in this category should have their Bloom's level corrected downward to reflect actual cognitive demand, regardless of other disposition decisions.
5. Structurally Defective Option Sets
Why this pattern is bad
Option quality is as important as stem quality. Three specific option construction failures appear repeatedly in this sample and each has a different mechanism of harm:
- "All of the above" as the correct answer eliminates discrimination entirely. A candidate who recognizes any one of the three individual options as correct can select "All of the above" without evaluating the others. The question measures test-taking strategy, not knowledge.
- "None of these" as the correct answer is a valid format only when the correct answer is deliberately withheld, because it forces the candidate to rule out every listed option. In the observed cases, the correct answer (e.g., hydrocortisone for CAH) was simply omitted from the option set, so the item is a construction failure rather than a deliberate "none of the above" design.
- Fabricated or implausible distractors reduce discrimination by making the correct answer obvious through elimination. When one distractor is a non-existent syndrome ("Paul–Bunnel syndrome," "De–pan syndrome" in Q-44c6af91) or an absurd clinical claim ("occurs only in male children" for SIDS in Q-d7bb2e01), the question becomes a two-option choice at best.
How it shows up
- "All of the above" correct answers: Q-5b446edf (drugs in neonatal resuscitation), Q-310f631c (pulmonary surfactant — three incomplete sentence fragments plus "All of the above"), Q-ae49aaf1 (hypothermia prevention), Q-2aa0a8c6 (delayed puberty — "All of the options"), Q-277ad283 (vaccines for unimmunized 8-year-old).
- "None of these" correct answers: Q-ac4552e2 (virilizing adrenal hyperplasia treatment — the actual correct answer, hydrocortisone, is absent from the options).
- Fabricated distractors: Q-44c6af91 (nuchal fold thickness — "Paul–Bunnel syndrome" and "De–pan syndrome" are not recognized pediatric syndromes), Q-d7bb2e01 (SIDS — "occurs only in male children" is an absurd distractor).
- Overlapping options: Q-6fc883be (breastfeeding and enteric infection — option B "nutrients and immunological superiority" is a superset of option C "immunoglobulin content," making both simultaneously defensible).
- "All of the above" with incomplete option text: Q-310f631c has options that are sentence fragments ("Pulmonary surfactant itself") that do not form complete statements, making the question structurally incoherent.
Example question IDs with short explanations
- Q-5b446edf: "Drugs used in neonatal resuscitation include: A, B, C, D. All of the above." Bloom's 1, easy-flagged. The "All of the above" format means any candidate who knows one drug is used can select the correct answer. No discrimination.
- Q-ac4552e2: The correct treatment for virilizing CAH (hydrocortisone) is not among the four options. "None of these" is marked correct. This is a construction failure, not a deliberate design choice.
- Q-310f631c: Options are incomplete sentence fragments. The question is structurally incoherent and tests nothing.
- Q-44c6af91: Two of the four options are not recognized medical syndromes. The question reduces to a two-option choice.
Recommended disposition
- Q-310f631c, Q-ac4552e2: Disable. The construction failures are too fundamental to fix without a complete rewrite, and the underlying concepts are covered by better items elsewhere.
- Q-5b446edf, Q-ae49aaf1, Q-2aa0a8c6: Disable. "All of the above" as the correct answer in a Bloom's 1 item adds no value. If the concept must be tested, rewrite as a "which of the following is NOT used" format or embed in a clinical scenario.
- Q-277ad283: Fix. The concept (catch-up vaccination in an older unimmunized child) is clinically relevant. Rewrite with a single clearly excluded vaccine and remove "All of the above."
- Q-44c6af91: Disable. Fabricated distractors cannot be fixed without a full rewrite, and the underlying concept (nuchal fold thickness in Down syndrome) is trivially easy at Bloom's 1.
- Q-6fc883be: Fix. Replace the overlapping option with a clearly distinct mechanism to eliminate ambiguity.
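The catch-all defects in this category are mechanical enough to screen for across the full pool. The sketch below shows one way to flag them, assuming each question is available as a list of option strings plus the index of the keyed answer; the function name, the record shape, and the placeholder option texts are hypothetical. It does not attempt to detect fabricated distractors or overlapping options, which still need a subject-matter reviewer.

```python
import re

# Catch-all wordings named in this section. Matching is case-insensitive
# and tolerates trailing punctuation.
CATCH_ALL = re.compile(
    r"^(all|none) of (the )?(above|these|the options)\W*$", re.IGNORECASE
)


def option_flags(options: list[str], correct_index: int) -> list[str]:
    """Return structural flags for one option set."""
    flags = []
    for i, text in enumerate(options):
        cleaned = text.strip()
        if CATCH_ALL.match(cleaned):
            role = "correct" if i == correct_index else "distractor"
            flags.append(f"catch-all option as {role}: {cleaned!r}")
    # Identical option text (after trimming and lowercasing) is also a defect.
    normalized = [o.strip().lower() for o in options]
    if len(set(normalized)) < len(normalized):
        flags.append("duplicate option text")
    return flags


# Placeholder option texts, for illustration only.
flags = option_flags(
    ["Option A", "Option B", "Option C", "All of the above"], correct_index=3
)
print(flags)  # ["catch-all option as correct: 'All of the above'"]
```

Items flagged with a catch-all option as the keyed answer map to the Disable dispositions above; a catch-all used only as a distractor is a weaker signal and may simply warrant rewording.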
6. Topic Saturation and Near-Duplicate Coverage
Why this pattern is bad
When the same clinical concept is tested multiple times at the same cognitive level with the same correct answer, the pool wastes question slots that could cover under-represented high-yield topics. More importantly, near-duplicates with slightly different stems but the same answer create a false sense of coverage breadth. In this sample, the saturation problem is most severe for Kawasaki disease, pyloric stenosis, rotavirus diarrhea, and infant of diabetic mother — all of which appear multiple times at Bloom's 1–2 without differentiation.
How it shows up
Kawasaki disease: The benchmark set alone contains four Kawasaki items (Q-4f17d182, Q-194c07d0, Q-b4c1438d, Q-54912a50). The candidate pool adds Q-82ae9600 (diagnosis), Q-3cc6fc33 (NOT a component), Q-0bd6bc16 (straightforward recall), and Q-f1913f5c (NOT a major criterion for rheumatic fever — adjacent topic). The benchmark items are differentiated by clinical scenario (initial treatment vs. IVIG resistance vs. criterion identification). The candidate pool items are not differentiated — they all test the same diagnostic recognition at Bloom's 1–2.
Pyloric stenosis: Q-9087bb1f (NOT seen in pyloric stenosis) and Q-3229855f (true statement about pyloric stenosis) appear in the same shard and overlap substantially in content. Q-c04e26ff (projectile vomiting, palpable mass) is a better-constructed item that covers the same ground.
Rotavirus diarrhea: Q-67d74a85 (bare factual — rotavirus is most common cause) and Q-6301274b (thin vignette wrapper — same answer) are near-duplicates within a single shard. Both test the same single fact.
Infant of diabetic mother: Q-7f61dd15 (IDM → hypoglycemia) and Q-2b35c157 (NOT observed in IDM) appear in the same shard and together with other IDM items across the pool. The concept is important but the coverage is repetitive at low cognitive levels.
Neonatal jaundice thresholds and criteria: Multiple questions across shards test pathological jaundice criteria, breast milk jaundice mechanism, and bilirubin thresholds at Bloom's 1–2. The PYQ set already contains Q-b0622be3 (breast milk jaundice mechanism) at an appropriate level.
Example question IDs with short explanations
- Q-82ae9600 + Q-3cc6fc33 + Q-0bd6bc16: Three Kawasaki items in the candidate pool, all at Bloom's 1–2, all testing diagnostic recognition or criterion recall. The benchmark already has four Kawasaki items at Bloom's 3–4. The candidate pool items add no incremental value.
- Q-67d74a85 + Q-6301274b: Near-identical rotavirus questions in the same shard. One must be disabled; the other should be upgraded to test management (ORS, zinc supplementation) rather than diagnosis.
- Q-9087bb1f + Q-3229855f: Pyloric stenosis duplicates in the same shard. Q-c04e26ff is the better item and covers the same concept at a higher cognitive level.
Recommended disposition
- For Kawasaki disease: Disable Q-82ae9600, Q-3cc6fc33, and Q-0bd6bc16 from high-stakes templates. The benchmark set provides adequate Kawasaki coverage at the appropriate level. If additional Kawasaki items are needed, they should test IVIG resistance management or coronary artery complication grading — not diagnostic recognition.
- For rotavirus: Disable Q-67d74a85 (bare factual). Fix Q-6301274b by reframing to test management (ORS composition, zinc dose, when to refer) rather than diagnosis.
- For pyloric stenosis: Disable Q-9087bb1f. Keep Q-c04e26ff and Q-3229855f after verifying they test distinct aspects (diagnosis vs. metabolic consequence).
- For IDM: Audit the full pool for IDM items and retain only those that test management decisions (glucose monitoring protocol, timing of first feed, echocardiography indication) rather than simple complication identification.
- The content team should run a topic-frequency audit across the full 7,754-question Pediatrics pool for Kawasaki disease, pyloric stenosis, rotavirus, and IDM before the next template build.
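The topic-frequency audit recommended above could start from something as simple as the following sketch. It assumes every question carries a topic tag and a Bloom's level; the record shape, the placeholder IDs, and the thresholds (`max_level`, `min_count`) are illustrative assumptions rather than agreed values.

```python
from collections import Counter

# Assumed record shape: (question_id, topic, bloom_level). The IDs, tags,
# and levels below are placeholders, not values from the actual pool.
pool = [
    ("Q-a", "Kawasaki disease", 1),
    ("Q-b", "Kawasaki disease", 2),
    ("Q-c", "Kawasaki disease", 2),
    ("Q-d", "Pyloric stenosis", 2),
    ("Q-e", "Pyloric stenosis", 3),
    ("Q-f", "Rotavirus", 1),
]


def saturated_topics(records, max_level=2, min_count=2):
    """Topics with at least `min_count` items at Bloom's <= `max_level`."""
    low = Counter(topic for _, topic, level in records if level <= max_level)
    return {topic: n for topic, n in low.items() if n >= min_count}


print(saturated_topics(pool))  # {'Kawasaki disease': 3}
```

A count alone does not show whether items are differentiated by clinical scenario, so the output is best read as a shortlist of topics to review by hand.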
7. Factual Currency and Guideline Concordance Failures in Threshold and Protocol Questions
Why this pattern is bad
This category is distinct from Category 2 (wrong answer keys) because the questions are not straightforwardly wrong — they reflect older guidelines, contested evidence, or context-dependent recommendations that have been presented as universal facts. Deploying these items teaches outdated practice and may conflict with what candidates encounter in current IAP, WHO, or NRP guidelines. The harm is subtler than a clearly wrong key but operationally significant because these items are harder to identify without clinical expertise.
How it shows up
- Breast milk storage: Q-aecf94e5 marks 24 hours as the correct refrigerator storage time for expressed breast milk. Current CDC and WHO guidelines state up to 4 days (96 hours) at 4°C. The correct answer is outdated by at least a decade.
- Neonatal oxygen management: Q-08388114 (neonate with SpO2 80%, GDM mother) marks "100% O2" as correct. In current neonatal practice, unrestricted 100% O2 is not recommended for term neonates; titrated oxygen to target SpO2 is the standard. The answer reflects pre-2010 practice.
- Congenital CMV diagnosis: Q-3d9d779c marks CMV IgM on neonatal serum as the gold standard. Current guidelines recommend urine or saliva PCR/viral culture within 3 weeks of birth as the gold standard. Serology is less sensitive and specific in neonates.
- Breastfeeding in HBsAg-positive mothers: Q-e151bf04 marks "active hepatitis B" as NOT a contraindication to breastfeeding. This is correct per current WHO and IAP guidelines (breastfeeding is encouraged after neonatal immunoprophylaxis), but the framing is ambiguous and the stem lacks the context of immunoprophylaxis having been given, making the answer guideline-dependent in a way that is not disclosed.
- Probiotic organism in breast milk: Q-bbde451a marks Lactobacillus as the primary probiotic organism in breast milk. Current evidence favors Bifidobacterium as the dominant probiotic genus in breast milk and the breastfed infant gut.
- Rotavirus vaccine age cutoff: The PYQ Q-dafbd73f tests "cannot be given after 6 months of age" as the correct answer. The actual upper age limit for the first dose is 15 weeks (Rotarix) or 12 weeks (RotaTeq), and the series must be completed by 32 weeks — the "6 months" framing is a simplification that may be guideline-version dependent.
Example question IDs with short explanations
- Q-aecf94e5: Breast milk refrigerator storage — 24 hours is the marked correct answer; current guidelines say up to 4 days. This will teach candidates an outdated and unnecessarily restrictive practice.
- Q-08388114: 100% O2 for a term neonate with SpO2 80% — reflects pre-2010 NRP guidelines. Current NRP recommends starting with 21% O2 for term neonates and titrating.
- Q-3d9d779c: CMV IgM as gold standard for congenital CMV is outdated. Urine or saliva PCR within 3 weeks of birth is now the standard.
- Q-bbde451a: Lactobacillus as primary breast milk probiotic is not supported by current evidence, which favors Bifidobacterium.
Recommended disposition
- Q-aecf94e5, Q-08388114, Q-3d9d779c, Q-bbde451a: Fix with mandatory clinical currency review. Each requires verification against the most recent IAP/WHO/NRP guidelines and correction of the answer key or stem before deployment.
- Q-e151bf04: Fix by adding the context of neonatal immunoprophylaxis having been administered, or replace with a cleaner example of a non-contraindication (e.g., maternal CMV in a term infant, or maternal hepatitis C).
- The content team should establish a guideline-version tagging practice for all Pediatrics questions that reference specific thresholds, schedules, or protocols, so that currency can be audited systematically as guidelines are updated.
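One way the guideline-version tagging practice could look is sketched below. This is a minimal illustration, not the team's actual schema: the `GuidelineTag` fields, the `needs_currency_review` helper, the version strings, the review dates, and the `Q-EXAMPLE` ID are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GuidelineTag:
    """Hypothetical tag linking a question to the guideline its key relies on."""
    question_id: str
    guideline_body: str      # e.g. "NRP", "WHO", "IAP"
    guideline_version: str   # edition or year the key was checked against
    last_reviewed: date      # date of the last clinical currency review


def needs_currency_review(tag, current_versions):
    """Return True when the tagged version differs from the current one.

    `current_versions` maps guideline body -> latest known version string.
    A body missing from the map is treated as needing review.
    """
    latest = current_versions.get(tag.guideline_body)
    return latest is None or latest != tag.guideline_version


# Illustrative data only: version strings and dates are placeholders,
# not real guideline editions or review records.
tags = [
    GuidelineTag("Q-08388114", "NRP", "placeholder-old", date(2020, 1, 1)),
    GuidelineTag("Q-EXAMPLE", "IAP", "placeholder-current", date(2024, 1, 1)),
]
current_versions = {"NRP": "placeholder-current", "IAP": "placeholder-current"}

stale_ids = [t.question_id for t in tags if needs_currency_review(t, current_versions)]
print(stale_ids)
```

Storing the version the key was verified against, rather than only a review date, lets an audit select affected questions as soon as a guideline body publishes an update.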
Prioritization
The six issue categories from the summary, together with the guideline-currency failures tracked here as Category 7, are not equal in urgency. The following prioritization is recommended for the content operations team:
Tier 1 — Immediate action required (no deployment until resolved)
Broken image-dependent questions (Category 1): Disable all confirmed instances immediately. No editorial review needed — these are non-functional by definition. Estimated count in the 200-question sample: 8–10 items.
Wrong or clinically unsafe answer keys (Category 2): Flag all confirmed instances for expert clinical review before any deployment. Priority cases: Q-4ea331d8, Q-11be839e, Q-85cbee3b, Q-51829a39, Q-751e677a, Q-2ff4f881, Q-0270de64. Estimated count in the 200-question sample: 10–15 items (5–7.5%). If that rate holds across the subject pool of 7,754 questions, roughly 390–580 items would be affected, which warrants a systematic audit pass rather than spot fixes.
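The pool-level scale of the Tier 1 categories can be approximated by proportional scaling of the sample estimates. The sketch below assumes the 200-question sample is representative of the 7,754-question pool, which this review does not establish, so the output is an order-of-magnitude guide only. The function name `project_to_pool` is illustrative.

```python
POOL_SIZE = 7754     # subject pool size stated in the review
SAMPLE_SIZE = 200    # reviewed candidate sample


def project_to_pool(sample_low, sample_high, pool_size=POOL_SIZE, sample_size=SAMPLE_SIZE):
    """Scale a sample count range to the pool by simple proportion.

    Assumption: the sample is representative of the pool. No confidence
    interval is computed; this is a rough planning figure.
    """
    low = round(sample_low / sample_size * pool_size)
    high = round(sample_high / sample_size * pool_size)
    return low, high


# Sample estimates taken from the Tier 1 entries above.
tier1_estimates = {
    "Category 1 (broken image-dependent)": (8, 10),
    "Category 2 (wrong or unsafe keys)": (10, 15),
}

for name, (low, high) in tier1_estimates.items():
    print(name, project_to_pool(low, high))
```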
Tier 2 — High-priority fix or disable (resolve before next template build)
Structurally defective option sets (Category 5): Disable "All of the above" correct-answer items and "None of these" items. Fix overlapping options. Estimated count: 8–12 items. These are quick to identify and quick to resolve.
Factual currency and guideline concordance failures (Category 7): Fix with clinical expert review. Estimated count: 6–10 items. These require subject-matter expertise to resolve but are individually straightforward once identified.
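Because Category 5 items are described as quick to identify, a first-pass filter for catch-all options could be scripted. The following is a rough sketch under assumptions: the question record shape (`id`, `options`, `answer_index`) is hypothetical, and the pattern covers only the "All/None of the above/these" phrasings; overlapping or fabricated distractors would still need human review.

```python
import re

# Catch-all phrasings the review treats as structural defects; extend as needed.
CATCH_ALL = re.compile(
    r"^\s*(all|none)\s+of\s+(the\s+above|these)\s*\.?\s*$", re.IGNORECASE
)


def flag_catch_all_options(question):
    """Return the catch-all options in a question, noting whether one is the key.

    `question` is a hypothetical dict:
    {"id": str, "options": [str], "answer_index": int}
    """
    hits = []
    for i, option in enumerate(question["options"]):
        if CATCH_ALL.match(option):
            hits.append({"option": option, "is_key": i == question["answer_index"]})
    return hits


# Illustrative record, not a real item from the pool.
sample = {
    "id": "Q-EXAMPLE",
    "options": ["Option A", "Option B", "Option C", "All of the above"],
    "answer_index": 3,
}
flags = flag_catch_all_options(sample)
print(flags)
```

Recording whether the catch-all option is the keyed answer matters here, since the recommendation is to disable items where "All of the above" is correct rather than every item that merely contains it.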
Tier 3 — Systematic quality improvement (address in rolling content improvement cycle)
Bloom's 1 recall overload (Category 3): The largest volume problem. Estimated 40–50 items in the 200-question sample qualify for disable or vignette conversion. Prioritize Neonatology and Growth & Development topic areas first. This is a multi-week content operation, not a single-pass fix.
Bloom's inflation in vignette-wrapped items (Category 4): Estimated 20–25 items. Fix by adding a management or complication decision layer to the stem. Lower urgency than Categories 1–2 because these items are not harmful, just low-value.
Topic saturation and near-duplicates (Category 6): Requires a topic-frequency audit across the full 7,754-question pool before dispositions can be finalized. Kawasaki disease, pyloric stenosis, rotavirus, and IDM should be the first four topics audited.
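A topic-frequency audit of the kind proposed for Category 6 might start by counting low-Bloom's items per topic. The sketch below is illustrative only: the record fields (`topic`, `bloom`), the threshold value, and the sample rows are assumptions, not data from the pool.

```python
from collections import Counter


def low_level_topic_counts(questions, max_bloom=2):
    """Count questions per topic at or below the given Bloom's level."""
    return Counter(q["topic"] for q in questions if q["bloom"] <= max_bloom)


def saturated_topics(questions, threshold, max_bloom=2):
    """Return topics with at least `threshold` low-level questions, sorted by name."""
    counts = low_level_topic_counts(questions, max_bloom)
    return sorted(topic for topic, n in counts.items() if n >= threshold)


# Hypothetical rows; real input would come from the full question pool.
questions = [
    {"id": "Q-A", "topic": "Kawasaki disease", "bloom": 1},
    {"id": "Q-B", "topic": "Kawasaki disease", "bloom": 2},
    {"id": "Q-C", "topic": "Kawasaki disease", "bloom": 4},
    {"id": "Q-D", "topic": "Pyloric stenosis", "bloom": 1},
]

print(saturated_topics(questions, threshold=2))
```

Counting only items at Bloom's 1–2 keeps higher-level questions on the same topic from being flagged, which matches the concern that saturation is a problem specifically at low cognitive levels.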
Example Keep / Fix / Disable Calls
The following table summarizes representative disposition calls from across the reviewed shards. These are illustrative of the categories above and are not exhaustive.
KEEP — as-is or with minor polish
| Question ID | Topic | Reason to Keep |
|---|---|---|
| Q-d8a73e58 | Genetics | Well-constructed Prader-Willi vignette, Bloom's 4, plausible chromosomal distractors |
| Q-ed0a3b5c | Rheumatology | Clean HSP triad vignette, correct key, meaningful distractors (Kawasaki, SLE, JIA) |
| Q-5d86ee43 | Nutrition | Multi-statement SAM criteria question, Bloom's 4, tests all three WHO criteria |
| Q-6f44c2ea | Neurology | Sturge-Weber/GNAQ vignette, age-appropriate features, Bloom's 3 |
| Q-6540b090 | Nephrology | IgA nephropathy EXCEPT question with clinically meaningful distractors, Bloom's 4 |
| Q-c4bbebf2 | Hematology | APL/DIC vignette with CBC and coagulation data, genuine Bloom's 4 reasoning |
| Q-44c44fa2 | Nephrology | PSGN vignette with C3 normalization timeline, clinically grounded, Bloom's 3 |
| Q-f02a85c4 | Infectious Diseases | Rabies re-exposure management, multi-variable integration, Bloom's 3 |
| Q-43ee5828 | Respiratory | CF multi-system vignette, plausible distractors, Bloom's 3 |
| Q-bbe71f0c | Neurology | Friedreich's ataxia with GAA repeats, cardiac and metabolic integration, Bloom's 3 |
| Q-c0532543 | Neonatology | NRP decision threshold (HR 88 after PPV), applied Bloom's 3, unambiguous key |
| Q-c04e26ff | Surgery/GI | Pyloric stenosis vignette with family history, appropriate distractors, Bloom's 3 |
| Q-8620c862 | Critical Care | Diethylene glycol poisoning, multi-step reasoning, Bloom's 3 |
| Q-b3ac6425 | Immunology | Job's syndrome triad, appropriate distractors (WAS, Nezelof), Bloom's 3 |
| Q-6afb3145 | Endocrinology | DKA management — 3% saline NOT required, genuine applied reasoning, Bloom's 3 |
| Q-8d3f905a | Surgery | TEF with polyhydramnios and failed NG tube, clinically relevant distractors |
| Q-f0468abb | Neonatology | Post-hypoglycemia correction — which intervention is detrimental, Bloom's 3 |
FIX — specific remediation required
| Question ID | Issue | Recommended Fix |
|---|---|---|
| Q-4ea331d8 | Wrong key: dexamethasone for dengue shock | Change correct answer to IV crystalloid; revise distractors to reflect WHO guidelines |
| Q-85cbee3b | Wrong key: IDA for thalassemia major picture | Change correct answer to β-thalassemia major or rewrite vignette to fit IDA |
| Q-51829a39 | Wrong key: Gilbert disease for conjugated hyperbilirubinemia | Reassign correct answer to choledochal cyst or biliary atresia; replace Gilbert with Alagille as distractor |
| Q-11be839e | Wrong key: tactile stimulation after failed resuscitation | Change correct answer to bag-mask ventilation; clarify resuscitation sequence in stem |
| Q-751e677a | Inverted logic: AR marked false for Tangier disease | Replace option C with a genuinely false statement; verify all options against references |
| Q-0270de64 | Wrong key: IV fluids as initial management for intussusception | Reframe stem to ask about resuscitation specifically, or change correct answer to enema reduction |
| Q-9ce33fbe | Factual error: TdT- in ALL | Correct TdT- to TdT+ for pre-B ALL; verify immunophenotype consistency with clinical scenario |
| Q-aecf94e5 | Outdated guideline: 24-hour breast milk storage | Update to current CDC/WHO guideline (up to 4 days at 4°C) |
| Q-08388114 | Outdated practice: 100% O2 for term neonate | Update to current NRP recommendation (21% O2, titrate to SpO2 target) |
| Q-3d9d779c | Outdated diagnostic: CMV IgM as gold standard | Update correct answer to urine/saliva PCR within 3 weeks of birth |
| Q-dea83ff4 | Ambiguous "which is true" format, debatable key | Rewrite as clinical vignette (post-streptococcal nephritis) with management or pathophysiology question |
| Q-bc0c20bd | Two simultaneously correct options (HIV vertical transmission) | Reframe as "most important single intervention during labor" or restructure as management-priority vignette |
| Q-82ae9600 | Bloom's inflation: Kawasaki diagnosis at Bloom's 4 | Reframe as IVIG-resistance or coronary complication management question |
| Q-95b97ba1 | Factual error: PDA described as pansystolic murmur | Correct to "continuous machinery murmur best heard at left infraclavicular area" |
| Q-e4c1215d | Editorial artifact in stem: "(Recent NEET Pattern 2016-17)" | Remove editorial tag; verify distractor accuracy (Type 1 is most common OI, not Type 3) |
| Q-277ad283 | "All of the above" + guideline-ambiguous content | Rewrite with single clearly excluded vaccine; remove "All of the above" |
| Q-6301274b | Near-duplicate of Q-67d74a85 (rotavirus diagnosis) | Reframe to test management (ORS, zinc) rather than diagnosis |
| Q-341369bb | Bloom's 1 milestone recall | Embed in developmental surveillance vignette: "which finding would be expected at this age?" |
| Q-55ea83f5 | Bloom's 2, easy-flagged nutrition question | Add brief clinical vignette (age, feeding history, examination finding) to reach Bloom's 3 |
| Q-7951212b | Hypernatremic dehydration — "brain hemorrhage" needs context | Specify "most devastating structural complication" to make answer unambiguous over seizures |
| Q-688814b8 | Biochemistry question misframed as clinical endocrinology | Rewrite as pure clinical endocrinology question about cretinism management, or rewrite biochemistry stem with precise transporter distinction |
| Q-d9cb362e | BAL as next investigation before imaging/sweat chloride | Revise stem to specify prior workup done, or change correct answer to HRCT/sweat chloride with rationale |
| Q-ccf4fb74 | Stem conflates small bowel obstruction with duodenal obstruction | Fix stem to specify "duodenal obstruction" if duodenal atresia is intended |
| Q-5cf52017 | Broken distractor: "WHZ less than 25 Z-scores" | Fix distractor text to "WHZ less than −3 Z-scores" or replace with MUAC threshold |
DISABLE — remove from active pool
| Question ID | Reason to Disable |
|---|---|
| Q-df6b1547 | No stem whatsoever — completely non-functional |
| Q-fa0dc3b6 | Image-dependent (craniopharyngioma MRI), no image present |
| Q-ccc2445d | Image-dependent (neonatal ECG), no ECG attached |
| Q-7cec58f8 | Image-dependent (instrument identification), no image present |
| Q-2557f818 | Image-dependent (developmental activity), no image or description |
| Q-9506cbfe | Image-dependent (child abuse findings), no image present |
| Q-ac1de03d | Image-dependent (lip lesions), no image present |
| Q-2ff4f881 | Factually wrong key (diaphragmatic hernia as cause of double bubble sign) AND contradicts Q-1d39a2d3 |
| Q-75936207 | Factually wrong key (conflates adult polycythemia vera cutoff with neonatal polycythemia) |
| Q-310f631c | Structurally incoherent: incomplete sentence fragments as options, "All of the above" correct answer |
| Q-ac4552e2 | Correct answer (hydrocortisone) absent from options; "None of these" is not an acceptable key |
| Q-5b446edf | "All of the above" correct answer, Bloom's 1, easy-flagged |
| Q-44c6af91 | Fabricated distractors (non-existent syndromes), Bloom's 1 trivial recall |
| Q-eb9cda41 | Pure numeric threshold recall (ELBW definition), no clinical context |
| Q-562076b0 | Pure numeric threshold recall (apnea definition), no clinical context |
| Q-b4308e8a | Pure numeric threshold recall (hypothermia grades), confusingly similar options |
| Q-b5e0c769 | Isolated physiological trivia (neonatal cardiac output in ml/kg/min), no clinical relevance |
| Q-f3054f5a | Pure eponym recall (Caffey disease), Bloom's 1, no clinical scenario |
| Q-a73eac1d | Pure eponym recall (Koplik spots = measles), Bloom's 1, no clinical context |
| Q-c050f251 | Pure nomenclature recall (Trisomy 13 = Patau), Bloom's 1, no clinical value at PG level |
| Q-d9ab69a8 | Trivially obvious single-fact recall (Down |