# Data Model: RAG Retrieval Quality + Topic Summary Persistence

**Branch**: `004-rag-retrieval-quality` | **Date**: 2026-04-07

---

## New persistent entity: TopicSummaryEntity

**Table**: `topic_summary`

**Migration**: `V6__topic_summary.sql`

| Column | Type | Notes |
|--------|------|-------|
| `id` | `UUID` PK | `gen_random_uuid()` default |
| `topic_id` | `VARCHAR(100)` NOT NULL | FK to `topic.id` (logical only; not enforced at DB level, see Constraints) |
| `summary_number` | `INT` NOT NULL | Sequential per topic (1, 2, 3, …). Set at insert time to `COUNT(*) + 1`, counted over the rows `WHERE topic_id = ?`. |
| `summary` | `TEXT` NOT NULL | Full markdown summary text |
| `sources_json` | `TEXT` NOT NULL | JSON array of `SourceReference` objects (same structure as `TopicSummaryResponse.sources`) |
| `generated_at` | `TIMESTAMPTZ` NOT NULL | UTC timestamp of generation |

**Constraints**: No unique constraint on `summary_number`. Numbering is sequential but not concurrency-safe (two simultaneous inserts for the same topic can receive the same number), which is accepted for the POC. No FK constraint is enforced at DB level because topic ids are static seed data.

---

## In-memory objects (new, from RAG quality work)

### ExpandedQuery (value object, not persisted)

Produced by `QueryExpansionService` for each user message.

| Field | Type | Description |
|-------|------|-------------|
| `original` | `String` | The user's literal question |
| `rewritten` | `String` | Clinically rewritten version used for vector search |

---

### LabelledContext (value object, not persisted)

Produced by `ChatService.buildContextPrompt()` to track the mapping from ref-labels to source entities.

| Field | Type | Description |
|-------|------|-------------|
| `sectionLabels` | `Map<String, SectionEntity>` | e.g. `{"S1" → SectionEntity, "S2" → SectionEntity}` |
| `figureLabels` | `Map<String, FigureEntity>` | e.g. `{"F1" → FigureEntity}` |
| `promptText` | `String` | The fully formatted context prompt, including the `[S1]`, `[F1]` tags |

---

## New API DTOs

### SavedSummaryItem (list view, no full text)

```java
record SavedSummaryItem(UUID id, int summaryNumber, Instant generatedAt) {}
```

Used in `GET /api/v1/topics/{id}/summaries` to show the summary history list without transmitting the full text.

### TopicSummaryResponse (existing, extended)

Adds `id` (UUID) and `summaryNumber` (int) fields so the frontend knows which saved record was just created.

---

## Existing entities (unchanged)

| Entity | Table | Change |
|--------|-------|--------|
| `SectionEntity` | `section` | None |
| `FigureEntity` | `figure` | None |
| `Message` | `message` | Each entry in the `sources` field gains a `refLabel` key |
| `ChatSession` | `chat_session` | None |
| `Book` | `book` | None |
| `Topic` | `topic` | None |

---

## Message.sources structure (existing, clarified)

After the RAG quality feature, each entry includes `refLabel`:

```json
{
  "type": "TEXT",
  "refLabel": "S1",
  "bookTitle": "Youmans & Winn Neurological Surgery",
  "page": 142,
  "chunkText": "..."
}
```
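As an illustration of how the `LabelledContext` mapping and the `refLabel` key in `Message.sources` could fit together, here is a minimal sketch. It is not the actual `ChatService.buildContextPrompt()` implementation. The `Section` and `Figure` records are hypothetical stand-ins for `SectionEntity` and `FigureEntity`, the `[S1] text` prompt layout is assumed, and `toSources` is an invented helper that emits plain maps rather than the real `SourceReference` type.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for SectionEntity and FigureEntity (the real entities have more fields).
record Section(String bookTitle, int page, String text) {}
record Figure(String bookTitle, int page, String caption) {}

// Mirrors the three LabelledContext fields listed in this document.
record LabelledContext(Map<String, Section> sectionLabels,
                       Map<String, Figure> figureLabels,
                       String promptText) {}

class LabelledContextSketch {

    // Assigns S1..Sn and F1..Fn in list order and tags each chunk in the prompt text.
    static LabelledContext build(List<Section> sections, List<Figure> figures) {
        Map<String, Section> sectionLabels = new LinkedHashMap<>();
        Map<String, Figure> figureLabels = new LinkedHashMap<>();
        StringBuilder prompt = new StringBuilder();
        for (Section s : sections) {
            String label = "S" + (sectionLabels.size() + 1);
            sectionLabels.put(label, s);
            prompt.append('[').append(label).append("] ").append(s.text()).append('\n');
        }
        for (Figure f : figures) {
            String label = "F" + (figureLabels.size() + 1);
            figureLabels.put(label, f);
            prompt.append('[').append(label).append("] ").append(f.caption()).append('\n');
        }
        return new LabelledContext(sectionLabels, figureLabels, prompt.toString());
    }

    // Builds entries shaped like Message.sources from the section labels (assumed mapping).
    static List<Map<String, Object>> toSources(LabelledContext ctx) {
        List<Map<String, Object>> sources = new ArrayList<>();
        ctx.sectionLabels().forEach((label, s) -> {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("type", "TEXT");
            entry.put("refLabel", label);
            entry.put("bookTitle", s.bookTitle());
            entry.put("page", s.page());
            entry.put("chunkText", s.text());
            sources.add(entry);
        });
        return sources;
    }

    public static void main(String[] args) {
        // Placeholder data for demonstration only.
        LabelledContext ctx = build(
            List.of(new Section("Youmans & Winn Neurological Surgery", 142, "Example chunk text.")),
            List.of(new Figure("Youmans & Winn Neurological Surgery", 142, "Example figure caption.")));
        System.out.print(ctx.promptText());
        System.out.println(toSources(ctx));
    }
}
```

Using `LinkedHashMap` keeps the labels in assignment order, so the sequence of `[S1]`, `[S2]` tags in `promptText` matches the order of the generated source entries.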