Full-Scope Research Synthesis
Executive summary
This synthesis covers the full current Indian Medical PG subject set, not just the original five-subject pilot review. Every indexed subject with a non-empty bank has now been packetized with a refreshed randomized sample; the "Other" bucket is empty and contributes no questions. Only five subjects currently have a manual narrative review, so this document should be read as a full-scope research memo built from two evidence layers:
- directly observed conclusions from the five manually reviewed pilot reports
- statistically suggested hypotheses from the full 21-subject packet and metadata layer
The strongest qualitative evidence still comes from the five pilot writeups. The broader full-scope stats are useful for prioritization and risk detection, but they are not by themselves equivalent to subject-level manual review.
What changed in this publish
- Subjects indexed: 21
- Subjects with published team reports: 5
- Total bank size across indexed subjects: 169690
- Total sampled questions across the full run: 2351
- Total candidate questions sampled across the full run: 2000
- Total gold-reference questions sampled across the full run: 351
The biggest difference from the earlier pilot-only synthesis is that we now have full subject coverage at the stats layer. That means we can point to where the problem is likely largest even before every subject has a hand-reviewed narrative report.
Evidence model
Directly observed from manual review
The five pilot reports provide the strongest conceptual findings currently available:
- Anatomy: low-yield trivia, comparative-anatomy drift, and recall-heavy structure questions
- Physiology: split between trivial recall and over-generated molecular biomedical content
- Pathology: one-line association questions plus forensic spillover
- Pharmacology: flashcard-style drug trivia and classification recall
- Microbiology: strongest good examples, but diluted by vector/species/antigen trivia
These are observed findings, not just inferred patterns.
Statistically suggested from the full-scope run
Across the broader sample, several subjects show heavy Bloom 1 saturation, thin gold-reference coverage, or large bank size without published manual reports. These are risk signals, not proof of identical failure modes. They should be used to decide where the next manual narrative reviews should go.
Bloom level distribution across all sampled questions (candidate and gold-reference combined)
- Bloom 1: 987
- Bloom 2: 910
- Bloom 3: 285
- Bloom 4: 147
- Bloom 5: 22
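To put these counts on a common scale, the shares can be computed directly from the figures above. This is a minimal sketch; the variable names are illustrative and are not taken from any existing pipeline.

```python
# Bloom level counts across all sampled questions, copied from the list above.
bloom_counts = {1: 987, 2: 910, 3: 285, 4: 147, 5: 22}

total = sum(bloom_counts.values())

# Share of each Bloom level as a percentage of all sampled questions.
bloom_share = {level: round(100 * n / total, 1) for level, n in bloom_counts.items()}

# Combined share of Bloom 3 and above.
higher_order = sum(n for level, n in bloom_counts.items() if level >= 3)
higher_order_share = round(100 * higher_order / total, 1)

for level, share in bloom_share.items():
    print(f"Bloom {level}: {share}%")
print(f"Bloom 3+: {higher_order_share}%")
```

The counts sum to 2351, matching the total sampled figure, so these shares describe candidate and gold-reference questions together.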
Where trivial recall pressure looks strongest
These are the subjects where the randomized candidate sample is most saturated with Bloom 1 questions:
- Anatomy: 67/100 candidate questions at Bloom 1 (67%)
- Forensic Medicine: 61/100 candidate questions at Bloom 1 (61%)
- Pathology: 60/100 candidate questions at Bloom 1 (60%)
- Biochemistry: 59/100 candidate questions at Bloom 1 (59%)
- Microbiology: 59/100 candidate questions at Bloom 1 (59%)
- Community Medicine: 52/100 candidate questions at Bloom 1 (52%)
This does not prove that those subjects are fully dominated by bad questions. It is a proxy signal showing where one-step factual recall may be crowding out better exam-level material.
Where higher-order material is showing up more often
These subjects currently show a stronger Bloom 3+ share in the randomized candidate sample:
- General Medicine: 46/100 candidate questions at Bloom 3+ (46%)
- Internal Medicine: 24/100 candidate questions at Bloom 3+ (24%)
- Pediatrics: 22/100 candidate questions at Bloom 3+ (22%)
- Surgery: 20/100 candidate questions at Bloom 3+ (20%)
- Obstetrics and Gynecology: 19/100 candidate questions at Bloom 3+ (19%)
- Dermatology: 18/100 candidate questions at Bloom 3+ (18%)
This is only a tentative signal. Higher Bloom metadata does not automatically mean higher-quality questions, stronger distractors, or better exam relevance. It is best read as: these subjects may contain more material worth inspecting manually for fix or rewrite rather than mass disabling. General Medicine deserves extra caution, since its bank holds only 218 questions and it has no gold-reference sample.
Gold-reference coverage caveat
Some subjects have much thinner benchmark and previous-year-question (PYQ) coverage in the current packet than others:
- General Medicine: benchmark 0, PYQ 0
- Other: benchmark 0, PYQ 0 (empty bank, nothing sampled)
- Radiology: benchmark 0, PYQ 12
- Dermatology: benchmark 3, PYQ 12
This matters because weak gold coverage lowers confidence when we compare that subject’s generic bank against the target standard.
How to read this report correctly
- Use the five pilot reports for the deepest current conceptual taxonomy.
- Use this full-scope synthesis for breadth, prioritization, and coverage awareness.
- Do not treat the full-scope stats layer as a substitute for manual subject review.
Subject coverage table
| Subject | Total bank | Total sampled | Candidate sampled | Gold sampled | Published reports |
|---|---|---|---|---|---|
| Anatomy | 13876 | 120 | 100 | 20 | 1 |
| Anesthesiology | 3585 | 116 | 100 | 16 | 0 |
| Biochemistry | 10646 | 120 | 100 | 20 | 0 |
| Community Medicine | 10989 | 120 | 100 | 20 | 0 |
| Dermatology | 3237 | 115 | 100 | 15 | 0 |
| ENT | 4280 | 116 | 100 | 16 | 0 |
| Forensic Medicine | 5504 | 118 | 100 | 18 | 0 |
| General Medicine | 218 | 100 | 100 | 0 | 0 |
| Internal Medicine | 18340 | 120 | 100 | 20 | 0 |
| Microbiology | 11104 | 120 | 100 | 20 | 1 |
| Obstetrics and Gynecology | 10364 | 120 | 100 | 20 | 0 |
| Ophthalmology | 6703 | 120 | 100 | 20 | 0 |
| Orthopaedics | 4052 | 116 | 100 | 16 | 0 |
| Other | 0 | 0 | 0 | 0 | 0 |
| Pathology | 12365 | 120 | 100 | 20 | 1 |
| Pediatrics | 7754 | 120 | 100 | 20 | 0 |
| Pharmacology | 14472 | 120 | 100 | 20 | 1 |
| Physiology | 10474 | 120 | 100 | 20 | 1 |
| Psychiatry | 4716 | 118 | 100 | 18 | 0 |
| Radiology | 5382 | 112 | 100 | 12 | 0 |
| Surgery | 11629 | 120 | 100 | 20 | 0 |
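The headline totals reported earlier can be re-derived from this table. The sketch below copies the rows as tuples and checks the sums; the tuple layout and variable names are assumptions made for illustration only.

```python
# Rows copied from the subject coverage table:
# (subject, total_bank, total_sampled, candidate_sampled, gold_sampled, published_reports)
rows = [
    ("Anatomy", 13876, 120, 100, 20, 1),
    ("Anesthesiology", 3585, 116, 100, 16, 0),
    ("Biochemistry", 10646, 120, 100, 20, 0),
    ("Community Medicine", 10989, 120, 100, 20, 0),
    ("Dermatology", 3237, 115, 100, 15, 0),
    ("ENT", 4280, 116, 100, 16, 0),
    ("Forensic Medicine", 5504, 118, 100, 18, 0),
    ("General Medicine", 218, 100, 100, 0, 0),
    ("Internal Medicine", 18340, 120, 100, 20, 0),
    ("Microbiology", 11104, 120, 100, 20, 1),
    ("Obstetrics and Gynecology", 10364, 120, 100, 20, 0),
    ("Ophthalmology", 6703, 120, 100, 20, 0),
    ("Orthopaedics", 4052, 116, 100, 16, 0),
    ("Other", 0, 0, 0, 0, 0),
    ("Pathology", 12365, 120, 100, 20, 1),
    ("Pediatrics", 7754, 120, 100, 20, 0),
    ("Pharmacology", 14472, 120, 100, 20, 1),
    ("Physiology", 10474, 120, 100, 20, 1),
    ("Psychiatry", 4716, 118, 100, 18, 0),
    ("Radiology", 5382, 112, 100, 12, 0),
    ("Surgery", 11629, 120, 100, 20, 0),
]

# Column-wise totals over all rows.
total_bank = sum(r[1] for r in rows)
total_sampled = sum(r[2] for r in rows)
total_candidate = sum(r[3] for r in rows)
total_gold = sum(r[4] for r in rows)
total_reports = sum(r[5] for r in rows)

# Each row's total sample should equal candidate plus gold.
mismatched = [r[0] for r in rows if r[2] != r[3] + r[4]]

print(len(rows), total_bank, total_sampled, total_candidate, total_gold, total_reports)
print("Rows with inconsistent sample counts:", mismatched)
```

Run against the table as published, the sums agree with the figures under "What changed in this publish", and no row has a sample count that differs from candidate plus gold.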
Recommended next-wave reporting order
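These are the best next candidates for full narrative reports among subjects without a published report. The list is ordered by bank size, with the candidate Bloom 1 share shown alongside as the recall-pressure signal: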
- Internal Medicine: very large bank (18340 questions), candidate Bloom 1 share 34%
- Surgery: large bank (11629 questions), candidate Bloom 1 share 31%
- Community Medicine: large bank (10989 questions), candidate Bloom 1 share 52%
- Biochemistry: large bank (10646 questions), candidate Bloom 1 share 59%
- Obstetrics and Gynecology: large bank (10364 questions), candidate Bloom 1 share 39%
- Pediatrics: mid-sized bank (7754 questions), candidate Bloom 1 share 35%
- Ophthalmology: mid-sized bank (6703 questions), candidate Bloom 1 share 47%
- Forensic Medicine: mid-sized bank (5504 questions), candidate Bloom 1 share 61%
This is a prioritization heuristic, not a validated ranking model.
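One hypothetical way to fold both signals into a single number is to estimate how many Bloom 1 questions each bank would hold if the sampled share applied to the whole bank. The sketch below is an illustration only: it is not the method used to produce the list above, and the function name and the extrapolation itself are assumptions.

```python
# (subject, bank_size, candidate_bloom1_pct) for the eight subjects listed above.
candidates = [
    ("Internal Medicine", 18340, 34),
    ("Surgery", 11629, 31),
    ("Community Medicine", 10989, 52),
    ("Biochemistry", 10646, 59),
    ("Obstetrics and Gynecology", 10364, 39),
    ("Pediatrics", 7754, 35),
    ("Ophthalmology", 6703, 47),
    ("Forensic Medicine", 5504, 61),
]


def estimated_bloom1_volume(bank_size, bloom1_pct):
    """Rough Bloom 1 count, ASSUMING the sample share holds across the whole bank."""
    return round(bank_size * bloom1_pct / 100)


# Sort by estimated Bloom 1 volume, largest first.
ranked = sorted(
    candidates,
    key=lambda c: estimated_bloom1_volume(c[1], c[2]),
    reverse=True,
)

for name, bank, pct in ranked:
    print(f"{name}: ~{estimated_bloom1_volume(bank, pct)} estimated Bloom 1 questions")
```

Under this assumption the order shifts, with Biochemistry and Internal Medicine at the top and Pediatrics last. Each share comes from a 100-question sample, so the estimates carry wide uncertainty and should inform the order of manual review, not replace it.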
Operational takeaway
- The site now reflects full subject coverage at the packet and stats layer.
- The pilot reports remain the most trustworthy conceptual analysis.
- The full-scope synthesis now makes it clearer where to aim the next narrative reporting pass instead of treating all remaining subjects as equal.
Recommendation
Use the full-scope synthesis page to prioritize the next subject wave, but continue to treat final content policy as concept-first:
- disable with highest confidence where Bloom 1 dominance and low-yield fact patterns are obvious
- prefer fix/rewrite where a subject already shows meaningful Bloom 3+ presence
- be cautious in subjects with thin benchmark/PYQ gold coverage
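The three rules above can be sketched as a per-subject hint function. The thresholds are hypothetical and chosen only for illustration, since this memo does not define any. The function can only flag subjects for attention; deciding whether low-yield fact patterns are actually present still requires manual review.

```python
from typing import List, Optional

# Hypothetical thresholds (assumptions, not defined anywhere in this memo).
BLOOM1_HIGH_PCT = 60
BLOOM3PLUS_MEANINGFUL_PCT = 20
GOLD_THIN_MAX = 15


def triage_hints(
    bloom1_pct: Optional[int],
    bloom3plus_pct: Optional[int],
    gold_sampled: int,
) -> List[str]:
    """Return review hints for one subject; None means the share is not reported here."""
    hints = []
    if bloom1_pct is not None and bloom1_pct >= BLOOM1_HIGH_PCT:
        hints.append("check for low-yield recall patterns (disable candidates)")
    if bloom3plus_pct is not None and bloom3plus_pct >= BLOOM3PLUS_MEANINGFUL_PCT:
        hints.append("prefer fix/rewrite")
    if gold_sampled <= GOLD_THIN_MAX:
        hints.append("thin gold coverage: treat comparisons cautiously")
    return hints


# Inputs taken from figures reported in this memo.
print("Forensic Medicine:", triage_hints(61, None, 18))
print("Internal Medicine:", triage_hints(34, 24, 20))
print("General Medicine:", triage_hints(None, 46, 0))
```

Hints are additive, so a subject such as General Medicine can be both a fix/rewrite candidate and a thin-coverage case at the same time.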