# Tasks: RAG Retrieval Quality Improvements

**Input**: Design documents from `/specs/004-rag-retrieval-quality/`
**Prerequisites**: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/ ✅

**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.

**Tests**: Not requested in spec — no test tasks generated.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (US1, US2, US3, US4)

---

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: No new project structure needed — services are added to the existing `retrieval/` package. This phase is a single verification step.

- [x] T001 Verify active branch is `004-rag-retrieval-quality` and `backend/src/main/java/com/aiteacher/retrieval/` exists

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Two lightweight value objects shared by both user stories.

**⚠️ CRITICAL**: Both user story phases depend on these records being present.

- [x] T002 Create `ExpandedQuery` record in `backend/src/main/java/com/aiteacher/retrieval/ExpandedQuery.java` with fields `String original` and `String rewritten`
- [x] T003 [P] Create `LabelledContext` record in `backend/src/main/java/com/aiteacher/retrieval/LabelledContext.java` with fields `Map sectionLabels`, `Map figureLabels`, and `String promptText`

**Checkpoint**: Foundation ready — US1 and US2 implementation can begin in parallel

---

## Phase 3: User Story 1 — Accurate Retrieval Despite Different Terminology (Priority: P1) 🎯 MVP

**Goal**: Before each retrieval call, rewrite the user's question into clinical terminology so that vector search finds relevant sections even when the user uses lay language.

**Independent Test**: Ask "what happens after cutting the skull?" and verify that the retrieved sections contain content about craniotomy even though that word does not appear in the query.
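
The expansion step described in the goal above can be sketched roughly as follows. This is only an illustration under stated assumptions: the real service is meant to use an injected `ChatClient`, which is replaced here by a hypothetical `UnaryOperator<String>` stand-in so the example runs without any framework, and the fallback to the original question on an empty model reply is an assumption, not something the task list specifies.

```java
import java.util.function.UnaryOperator;

/** Sketch of the query-expansion step; the LLM call is abstracted away. */
final class QueryExpansionSketch {

    /** Same shape as the ExpandedQuery record from T002. */
    record ExpandedQuery(String original, String rewritten) {}

    // Prompt wording taken from T004; %s is replaced by the user's question.
    static final String PROMPT =
        "Rewrite the following question using precise medical/surgical terminology "
      + "as it would appear in a neurosurgery textbook index. "
      + "Output only the rewritten question, nothing else. Question: %s";

    // Hypothetical stand-in for the injected ChatClient: prompt in, completion out.
    private final UnaryOperator<String> llm;

    QueryExpansionSketch(UnaryOperator<String> llm) {
        this.llm = llm;
    }

    ExpandedQuery expand(String query) {
        String rewritten = llm.apply(String.format(PROMPT, query));
        // Assumption (not in the task list): fall back to the original
        // question when the model returns nothing usable.
        if (rewritten == null || rewritten.isBlank()) {
            return new ExpandedQuery(query, query);
        }
        return new ExpandedQuery(query, rewritten.strip());
    }

    public static void main(String[] args) {
        // Fake model that always answers with a fixed clinical rewrite.
        QueryExpansionSketch service =
            new QueryExpansionSketch(prompt -> "What are the steps following craniotomy?");
        ExpandedQuery q = service.expand("what happens after cutting the skull?");
        System.out.println(q.original() + " -> " + q.rewritten());
    }
}
```

Keeping both strings in one record lets the caller send `rewritten()` to retrieval while still showing `original()` to the model in the QUESTION block.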
### Implementation for User Story 1

- [x] T004 [US1] Create `QueryExpansionService` in `backend/src/main/java/com/aiteacher/retrieval/QueryExpansionService.java`:
  - Constructor-inject `ChatClient`
  - Method `expand(String query): ExpandedQuery`
  - LLM prompt: *"Rewrite the following question using precise medical/surgical terminology as it would appear in a neurosurgery textbook index. Output only the rewritten question, nothing else. Question: {query}"*
  - Return `new ExpandedQuery(query, rewrittenText)`
  - Annotate with `@Service`
- [x] T005 [US1] Confirm that `NeurosurgeryRetriever.retrieve()` in `backend/src/main/java/com/aiteacher/retrieval/NeurosurgeryRetriever.java` needs no change:
  - The signature stays `retrieve(String query, UUID bookId)`, and the method keeps using `query` for vector search
  - *Note*: expansion is done in `ChatService` before calling `retrieve()`, so the caller passes the already-expanded query and no change to `NeurosurgeryRetriever` is required
- [x] T006 [US1] Modify `ChatService` in `backend/src/main/java/com/aiteacher/chat/ChatService.java`:
  - Constructor-inject `QueryExpansionService`
  - In `sendMessage()`, call `queryExpansionService.expand(fullQuestion)` before the retrieval loop
  - Pass `expandedQuery.rewritten()` to `retriever.retrieve()` instead of `fullQuestion`
  - Keep passing `fullQuestion` (original) to `buildContextPrompt()` so the QUESTION block shown to the model reflects what the user actually asked

**Checkpoint**: User Story 1 fully functional — retrieval now uses clinically rewritten queries

---

## Phase 4: User Story 2 — Grounded Citation in Generated Answers (Priority: P1)

**Goal**: Tag all retrieved sections and figures with short ref-labels (`[S1]`, `[F1]`…) in the prompt, instruct the model to cite only those labels, then post-process the answer to strip any citation referencing a label that was not provided.
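
The post-processing step in this goal can be sketched as a pure string function. The sketch below is an assumption-laden illustration rather than the project's implementation: the class name is hypothetical, and it assumes the valid-label set holds bare labels such as `S1` and `F2` (without brackets), matching the `refLabel` values described for the sources list.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of citation validation: drop [Sn]/[Fn] labels that were never provided. */
final class CitationValidatorSketch {

    // Matches [S1], [F12], ... and captures the bare label (e.g. "S1") in group 1.
    private static final Pattern CITATION = Pattern.compile("\\[((?:S|F)\\d+)\\]");

    static String validate(String generatedAnswer, Set<String> validLabels) {
        Matcher m = CITATION.matcher(generatedAnswer);
        StringBuilder cleaned = new StringBuilder();
        while (m.find()) {
            // Keep the citation verbatim when its label is known, otherwise remove it.
            String replacement = validLabels.contains(m.group(1)) ? m.group() : "";
            m.appendReplacement(cleaned, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(cleaned);
        return cleaned.toString();
    }

    public static void main(String[] args) {
        String answer = "Open the dura [S1]. Clip the aneurysm [S7][F2].";
        System.out.println(validate(answer, Set.of("S1", "F2")));
    }
}
```

Note that this version removes only the label itself; any whitespace that surrounded a stripped citation is left in place.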

**Independent Test**: Trigger a question where only sections S1–S3 are retrieved. Verify the generated answer contains no citation outside that set, and the `sources` list in the response carries `refLabel` fields.

### Implementation for User Story 2

- [x] T007 [US2] Create `CitationValidatorService` in `backend/src/main/java/com/aiteacher/retrieval/CitationValidatorService.java`:
  - Annotate with `@Service`
  - Method `validate(String generatedAnswer, Set validLabels): String`
  - Scan `generatedAnswer` for occurrences of `[Sn]` and `[Fn]` patterns using a regex like `\[(S|F)\d+\]`
  - Remove (or replace with empty string) any match whose label is not in `validLabels`
  - Return the cleaned answer text
- [x] T008 [US2] Modify `ChatService.buildContextPrompt()` in `backend/src/main/java/com/aiteacher/chat/ChatService.java`:
  - Change signature to return `LabelledContext` instead of `String`
  - Assign sequential labels: sections get `S1`, `S2`, …; figures get `F1`, `F2`, …
  - Prefix each section block with its label: `[S1] Section Title, p.N\n{fullText}\n\n`
  - Prefix each figure line with its label: `[F1] Fig. X (p.N): caption`
  - Populate `sectionLabels` and `figureLabels` maps in the returned `LabelledContext`
  - Store the full formatted prompt in `LabelledContext.promptText()`
- [x] T009 [US2] Update system prompt constant in `backend/src/main/java/com/aiteacher/chat/ChatService.java`:
  - Replace the citation rule *"Cite sources for each major point (book title and page number from the context)"* with: *"Cite claims using ONLY the reference labels provided in the context (e.g. [S1], [F2]). Do not invent page numbers, section titles, or labels not present in the CONTEXT block."*
- [x] T010 [US2] Wire `CitationValidatorService` into `ChatService.sendMessage()` in `backend/src/main/java/com/aiteacher/chat/ChatService.java`:
  - Constructor-inject `CitationValidatorService`
  - After the `chatClient.prompt()...call().content()` call, pass `assistantContent` and the label set from `LabelledContext` to `citationValidatorService.validate()`
  - Use the validated string as `assistantContent` going forward
- [x] T011 [US2] Modify `buildSources()` in `backend/src/main/java/com/aiteacher/chat/ChatService.java`:
  - Accept the `LabelledContext` (or its two maps) as an additional parameter
  - Add `"refLabel"` entry to each source map: e.g. `source.put("refLabel", "S1")` for sections, `source.put("refLabel", "F1")` for figures
  - Keep all other existing fields unchanged
- [x] T012 [US2] Update `sendMessage()` call chain in `backend/src/main/java/com/aiteacher/chat/ChatService.java` to thread `LabelledContext` through steps T008–T011:
  - `LabelledContext ctx = buildContextPrompt(fullQuestion, allSections, allFigures)`
  - Pass `ctx.promptText()` to the LLM call
  - Pass `ctx` label maps to `validate()` and `buildSources()`

**Checkpoint**: User Stories 1 and 2 both fully functional — queries are expanded, citations are grounded

---

## Phase 5: User Story 3 — User Visibility into Retrieval Confidence (Priority: P2)

**Goal**: The answer text contains `[S1]`-style labels (after US2). This phase exposes them in the frontend so users can see which claim maps to which source card.

**Independent Test**: Send a question, receive an answer with inline `[S1]` labels visible in the rendered text, and confirm clicking/hovering the label highlights the corresponding source card.

**Note**: Per research.md, the backend is already complete after US1+US2. US3 is a frontend-only UX enhancement.

### Implementation for User Story 3

- [x] T013 [US3] Modify `ChatMessage.vue` in `frontend/src/components/ChatMessage.vue`:
  - Parse the answer text for `[Sn]` and `[Fn]` citation labels using a regex
  - Render each label as a styled inline badge (e.g. `[S1]`)
  - When a badge is clicked or hovered, highlight the corresponding source card (match by `source.refLabel`)
- [x] T014 [US3] Update source card rendering in `frontend/src/components/ChatMessage.vue`:
  - Add a `data-ref-label` attribute to each source card element so it can be targeted by the citation badge interaction
  - Apply a visual highlight style (CSS class) when the card is active

**Checkpoint**: All three user stories functional — full end-to-end quality improvements delivered

---

## Phase 6: User Story 4 — Topic Summary Persistence & History (user-requested)

**Goal**: Every generated topic summary is saved to the database. When a topic is selected the UI shows a numbered history list; the student can view any past summary or generate a new one.

**Independent Test**: Generate a summary for "Intracranial Aneurysms", reload the page, click the topic — verify "Summary #1" appears. Generate again — verify "Summary #2" appears. Click "Summary #1" — verify the original text loads without regeneration.
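
The numbering behaviour exercised by this test can be illustrated with a small in-memory sketch. Everything here is a stand-in under stated assumptions: the `SavedSummary` record and the list-backed store are hypothetical substitutes for the JPA entity and repository, and only the "count existing rows for the topic, then add one" rule is taken from the task list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** In-memory sketch of per-topic summary numbering and history lookup. */
final class SummaryHistorySketch {

    /** Hypothetical row type standing in for the persisted summary entity. */
    record SavedSummary(String topicId, int summaryNumber, String summary) {}

    // Stand-in for the topic_summary table.
    private final List<SavedSummary> rows = new ArrayList<>();

    long countByTopicId(String topicId) {
        return rows.stream().filter(r -> r.topicId().equals(topicId)).count();
    }

    /** Saves a new summary numbered one past the topic's existing count. */
    SavedSummary save(String topicId, String summaryText) {
        int summaryNumber = (int) countByTopicId(topicId) + 1;
        SavedSummary row = new SavedSummary(topicId, summaryNumber, summaryText);
        rows.add(row);
        return row;
    }

    /** Returns the topic's history, oldest summary first. */
    List<SavedSummary> findByTopicIdOrderBySummaryNumberAsc(String topicId) {
        return rows.stream()
            .filter(r -> r.topicId().equals(topicId))
            .sorted(Comparator.comparingInt(SavedSummary::summaryNumber))
            .toList();
    }

    public static void main(String[] args) {
        SummaryHistorySketch store = new SummaryHistorySketch();
        store.save("intracranial-aneurysms", "first summary");
        store.save("intracranial-aneurysms", "second summary");
        for (SavedSummary s : store.findByTopicIdOrderBySummaryNumberAsc("intracranial-aneurysms")) {
            System.out.println("Summary #" + s.summaryNumber());
        }
    }
}
```

A count-then-insert scheme like this is simple, but two concurrent generations for the same topic could receive the same number; whether that matters depends on how the application is used.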

- [x] T018 Create Flyway migration `backend/src/main/resources/db/migration/V6__topic_summary.sql` — table `topic_summary` with columns: `id UUID PRIMARY KEY DEFAULT gen_random_uuid()`, `topic_id VARCHAR(100) NOT NULL`, `summary_number INT NOT NULL`, `summary TEXT NOT NULL`, `sources_json TEXT NOT NULL`, `generated_at TIMESTAMPTZ NOT NULL`
- [x] T019 [P] [US4] Create `TopicSummaryEntity.java` in `backend/src/main/java/com/aiteacher/topic/TopicSummaryEntity.java` — JPA `@Entity` mapped to table `topic_summary`; fields: `@Id UUID id`, `String topicId`, `int summaryNumber`, `String summary`, `String sourcesJson`, `Instant generatedAt`; no-arg + all-args constructor
- [x] T02X [P] [US4] Create `SavedSummaryItem.java` record in `backend/src/main/java/com/aiteacher/topic/SavedSummaryItem.java` — fields: `UUID id`, `int summaryNumber`, `Instant generatedAt` (list-view DTO, no full text)
- [x] T02X [US4] Create `TopicSummaryRepository.java` in `backend/src/main/java/com/aiteacher/topic/TopicSummaryRepository.java` — `extends JpaRepository`; add `List findByTopicIdOrderBySummaryNumberAsc(String topicId)` and `long countByTopicId(String topicId)`
- [x] T02X [US4] Modify `TopicSummaryResponse.java` in `backend/src/main/java/com/aiteacher/topic/TopicSummaryResponse.java` — add fields `UUID id` and `int summaryNumber` to the record components
- [x] T02X [US4] Modify `TopicSummaryService.java` in `backend/src/main/java/com/aiteacher/topic/TopicSummaryService.java` — inject `TopicSummaryRepository` and `ObjectMapper`; at end of `generateSummary()` compute `summaryNumber = (int) repository.countByTopicId(topicId) + 1`, persist a `TopicSummaryEntity` (serialise `sources` list to JSON via `objectMapper.writeValueAsString()`), and include `id` + `summaryNumber` in the returned `TopicSummaryResponse`; add `List listSummaries(String topicId)` and `TopicSummaryResponse getSummary(UUID summaryId)` methods
- [x] T02X [US4] Modify `TopicController.java` in `backend/src/main/java/com/aiteacher/topic/TopicController.java` — add `@GetMapping("/{id}/summaries")` returning `List` (delegates to `listSummaries`); add `@GetMapping("/{id}/summaries/{summaryId}")` returning `TopicSummaryResponse` (delegates to `getSummary`); both return 404 via `NoSuchElementException` when topic or summary not found
- [x] T02X [US4] Modify `topicStore.ts` in `frontend/src/stores/topicStore.ts` — add state `summaryList: SavedSummaryItem[]`; add `fetchSummaries(topicId)` action calling `GET /api/v1/topics/{topicId}/summaries`; add `fetchSummaryDetail(topicId, summaryId)` action calling `GET /api/v1/topics/{topicId}/summaries/{summaryId}` and setting `activeSummary`; clear `summaryList` when a different topic is selected
- [x] T02X [US4] Modify `TopicsView.vue` in `frontend/src/views/TopicsView.vue` — when a topic card is clicked: (1) call `topicStore.fetchSummaries(topicId)` first; (2) if summaries exist, display a summary history list showing chips "Summary #1 · [date]", "Summary #2 · [date]", … + a "Generate New" button; (3) clicking a chip calls `fetchSummaryDetail()` and renders the saved summary in the existing panel; (4) clicking "Generate New" calls `handleGenerate()` then re-calls `fetchSummaries()` to refresh the list; (5) if no summaries exist, show only the "Generate Summary" button (current behaviour)

**Checkpoint**: Summary persistence fully working end-to-end. US4 independently testable.

---

## Phase 7: Polish & Cross-Cutting Concerns

**Purpose**: Constitution IV compliance and cleanup.

- [x] T027 Update `README.md` Mermaid architecture diagram to add `QueryExpansionService` and `CitationValidatorService` to the chat pipeline flow, and the `topic_summary` table to the data diagram (required by Constitution Principle IV — must be in the same PR)
- [x] T028 [P] Log the expanded query at DEBUG level in `QueryExpansionService` (e.g. `log.debug("Query expanded: '{}' → '{}'", original, rewritten)`) for observability
- [x] T029 [P] Log stripped citation labels at WARN level in `CitationValidatorService` when any labels are removed (e.g. `log.warn("Stripped hallucinated citations: {}", removedLabels)`)

---

## Dependencies & Execution Order

### Phase Dependencies

- **Phase 1 (Setup)**: No dependencies — start immediately
- **Phase 2 (Foundational)**: Depends on Phase 1 — blocks all user story phases
- **Phase 3 (US1)**: Depends on Phase 2 (needs `ExpandedQuery`)
- **Phase 4 (US2)**: Depends on Phase 2 (needs `LabelledContext`); can run in parallel with Phase 3
- **Phase 5 (US3)**: Depends on Phase 4 (needs `refLabel` in sources)
- **Phase 6 (US4)**: No dependency on Phase 2 for the migration (T018); entity/service work (T019+) depends on T018
- **Phase 7 (Polish)**: Depends on all implementation phases complete

### User Story Dependencies

- **User Story 1 (P1)**: Depends on Phase 2 only — no dependency on US2 or US3
- **User Story 2 (P1)**: Depends on Phase 2 only — can run in parallel with US1
- **User Story 3 (P2)**: Depends on US2 (needs `refLabel` in the API response)
- **User Story 4**: Independent of US1–US3 — can start immediately after T018 migration

### Within Each User Story

- T004 → T006 (QueryExpansionService must exist before ChatService wiring)
- T007 → T010 → T012 (CitationValidatorService → wire into sendMessage → thread context)
- T008 → T012 (LabelledContext must be built before threading through)
- T013 → T014 (badge rendering before card targeting)

### Parallel Opportunities

- T002 and T003 (Phase 2) can run in parallel — different files
- Phase 3 (US1) and Phase 4 (US2) can run in parallel after Phase 2, but both modify `ChatService.java` (T006 and T008–T012), so coordinate those edits
- T028 and T029 (Polish) can run in parallel — different files

---

## Parallel Example: US1 + US2

```
After Phase 2 completes:

  Track A (US1):
    T004 — Create QueryExpansionService
    T005 — (no change to NeurosurgeryRetriever)
    T006 — Wire into ChatService

  Track B (US2):
    T007 — Create CitationValidatorService
    T008 — Modify buildContextPrompt() → LabelledContext
    T009 — Update system prompt
    T010 — Wire CitationValidatorService into sendMessage()
    T011 — Add refLabel to buildSources()
    T012 — Thread LabelledContext through call chain

  Merge point: Both tracks modify ChatService — coordinate T006 and T012
               to avoid conflicts (implement T006 first or use feature branches).
```

---

## Implementation Strategy

### MVP First (User Stories 1 + 2 — both P1)

1. Complete Phase 1: Setup (T001)
2. Complete Phase 2: Foundational (T002, T003)
3. Complete Phase 3: US1 — query expansion (T004–T006)
4. **VALIDATE**: Ask a lay-language question; confirm relevant clinical passages are retrieved
5. Complete Phase 4: US2 — citation grounding (T007–T012)
6. **VALIDATE**: Confirm no `[Sx]` label appears in the answer that wasn't in the retrieved set
7. **STOP and DEMO**: Both P1 stories deliver the core reliability improvements

### Incremental Delivery

1. Phase 1 + 2 → infrastructure ready
2. Phase 3 → vocabulary mismatch fixed → demo-able
3. Phase 4 → citation hallucination fixed → demo-able
4. Phase 5 → citation badges in UI → UX polish
5. Phase 6 → topic summary persistence and history → independently testable
6. Phase 7 → README + logging → PR-ready

---

## Notes

- `ChatService` is modified by both US1 (T006) and US2 (T008–T012) — coordinate edits or implement sequentially
- `buildContextPrompt()` changes return type from `String` to `LabelledContext` (T008) — update all callers in the same task
- The system prompt change (T009) is a one-line string edit inside `ChatService`; no separate class needed
- `CitationValidatorService` operates purely on strings — no DB or AI dependency, easy to unit-test manually
- US3 frontend tasks (T013–T014) are entirely in `ChatMessage.vue` — no backend change
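
As a companion to the note on `buildContextPrompt()`, the label assignment described in T008 might look roughly like the sketch below. It is a simplified illustration, not the project's code: the `Section` and `Figure` records and their fields are hypothetical, the map value types are assumed (the task list does not state them), and the question text that the real method also receives is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of sequential ref-label assignment for sections and figures. */
final class LabelledContextSketch {

    // Hypothetical input types; the real domain classes are not shown in the task list.
    record Section(String title, int page, String fullText) {}
    record Figure(String number, int page, String caption) {}

    // Same field names as T003; the generic parameters are an assumption.
    record LabelledContext(Map<String, Section> sectionLabels,
                           Map<String, Figure> figureLabels,
                           String promptText) {}

    static LabelledContext build(List<Section> sections, List<Figure> figures) {
        Map<String, Section> sectionLabels = new LinkedHashMap<>();
        Map<String, Figure> figureLabels = new LinkedHashMap<>();
        StringBuilder prompt = new StringBuilder();

        int sectionIndex = 1;
        for (Section s : sections) {
            String label = "S" + sectionIndex++;
            sectionLabels.put(label, s);
            // Format from T008: [S1] Section Title, p.N\n{fullText}\n\n
            prompt.append('[').append(label).append("] ")
                  .append(s.title()).append(", p.").append(s.page()).append('\n')
                  .append(s.fullText()).append("\n\n");
        }

        int figureIndex = 1;
        for (Figure f : figures) {
            String label = "F" + figureIndex++;
            figureLabels.put(label, f);
            // Format from T008: [F1] Fig. X (p.N): caption
            prompt.append('[').append(label).append("] Fig. ")
                  .append(f.number()).append(" (p.").append(f.page()).append("): ")
                  .append(f.caption()).append('\n');
        }
        return new LabelledContext(sectionLabels, figureLabels, prompt.toString());
    }

    public static void main(String[] args) {
        LabelledContext ctx = build(
            List.of(new Section("Craniotomy", 12, "Section text.")),
            List.of(new Figure("3.1", 13, "Bone flap")));
        System.out.print(ctx.promptText());
    }
}
```

The key sets of the two maps together give the valid-label set that the citation validator needs, and the same labels can be copied into each source's `refLabel` field.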