Data Model: RAG Retrieval Quality + Topic Summary Persistence

Branch: 004-rag-retrieval-quality | Date: 2026-04-07


New persistent entity: TopicSummaryEntity

Table: topic_summary
Migration: V6__topic_summary.sql

| Column | Type | Notes |
|---|---|---|
| id | UUID PK | Defaults to gen_random_uuid() |
| topic_id | VARCHAR(100) NOT NULL | References topic.id (logical FK, not enforced at DB level) |
| summary_number | INT NOT NULL | Sequential per topic (1, 2, 3, …). Set at insert time to (COUNT(*) WHERE topic_id = ?) + 1. |
| summary | TEXT NOT NULL | Full markdown summary text |
| sources_json | TEXT NOT NULL | JSON array of SourceReference objects (same structure as TopicSummaryResponse.sources) |
| generated_at | TIMESTAMPTZ NOT NULL | UTC timestamp of generation |

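Constraints: there is no unique constraint on summary_number. Numbering is sequential but not concurrency-safe (two simultaneous inserts for the same topic can receive the same number), which is accepted for the POC. No FK constraint is enforced at the DB level, because topic ids are static seed data.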
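
The numbering rule above can be sketched in plain Java. This is a minimal in-memory stand-in, not the project's persistence code: `TopicSummaryRow`, `InMemoryTopicSummaryStore`, and its methods are hypothetical names, and a list takes the place of the topic_summary table.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical row shape mirroring the topic_summary columns.
record TopicSummaryRow(UUID id, String topicId, int summaryNumber,
                       String summary, String sourcesJson, Instant generatedAt) {}

// Hypothetical in-memory stand-in for the topic_summary table.
final class InMemoryTopicSummaryStore {
    private final List<TopicSummaryRow> rows = new ArrayList<>();

    // Equivalent of: SELECT COUNT(*) FROM topic_summary WHERE topic_id = ?
    long countByTopicId(String topicId) {
        return rows.stream().filter(r -> r.topicId().equals(topicId)).count();
    }

    // Assigns summary_number = COUNT(*) + 1 at insert time.
    // Counting and inserting are two separate steps, so two concurrent
    // callers could read the same count and end up with the same number.
    TopicSummaryRow insert(String topicId, String summary, String sourcesJson) {
        int next = (int) countByTopicId(topicId) + 1;
        TopicSummaryRow row = new TopicSummaryRow(
                UUID.randomUUID(), topicId, next, summary, sourcesJson, Instant.now());
        rows.add(row);
        return row;
    }
}
```

Inserting twice for one topic yields numbers 1 and 2, while another topic starts again at 1. If duplicate numbers ever became a problem beyond the POC, a unique constraint on (topic_id, summary_number) would be one possible guard; that is a suggestion, not part of this spec.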


In-memory objects (new, from RAG quality work)

ExpandedQuery (value object, not persisted)

Produced by QueryExpansionService for each user message.

| Field | Type | Description |
|---|---|---|
| original | String | The user's literal question |
| rewritten | String | Clinically rewritten version used for vector search |
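
As a sketch, the two fields above map naturally onto a Java record. The null checks in the compact constructor are an assumption of this example, not a rule stated in this spec.

```java
import java.util.Objects;

// Value object carrying both forms of the user's question; never persisted.
record ExpandedQuery(String original, String rewritten) {
    ExpandedQuery {
        // Assumption: both fields are always present.
        Objects.requireNonNull(original, "original");
        Objects.requireNonNull(rewritten, "rewritten");
    }
}
```

Keeping both strings together lets the caller send `rewritten` to vector search while still having `original` available for display or logging.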

LabelledContext (value object, not persisted)

Produced by ChatService.buildContextPrompt() to track the mapping from ref-labels to source entities.

| Field | Type | Description |
|---|---|---|
| sectionLabels | Map<String, SectionEntity> | e.g. {"S1" → SectionEntity, "S2" → SectionEntity} |
| figureLabels | Map<String, FigureEntity> | e.g. {"F1" → FigureEntity} |
| promptText | String | The fully formatted context prompt including [S1], [F1] tags |
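
The label-to-entity mapping can be illustrated with a small sketch. The `SectionEntity` and `FigureEntity` records here are stripped-down stand-ins for the real entities, `ContextLabeller.label` is a hypothetical helper, and the one-line-per-source prompt layout is assumed; the spec only says the prompt includes [S1], [F1] tags.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal stand-ins for the real entities.
record SectionEntity(String text) {}
record FigureEntity(String caption) {}

record LabelledContext(Map<String, SectionEntity> sectionLabels,
                       Map<String, FigureEntity> figureLabels,
                       String promptText) {}

final class ContextLabeller {
    // Assigns S1, S2, ... to sections and F1, F2, ... to figures, in input order,
    // and builds a prompt where each source line starts with its [label].
    static LabelledContext label(List<SectionEntity> sections, List<FigureEntity> figures) {
        Map<String, SectionEntity> sectionLabels = new LinkedHashMap<>();
        Map<String, FigureEntity> figureLabels = new LinkedHashMap<>();
        StringBuilder prompt = new StringBuilder();
        for (int i = 0; i < sections.size(); i++) {
            String label = "S" + (i + 1);
            sectionLabels.put(label, sections.get(i));
            prompt.append('[').append(label).append("] ")
                  .append(sections.get(i).text()).append('\n');
        }
        for (int i = 0; i < figures.size(); i++) {
            String label = "F" + (i + 1);
            figureLabels.put(label, figures.get(i));
            prompt.append('[').append(label).append("] ")
                  .append(figures.get(i).caption()).append('\n');
        }
        return new LabelledContext(sectionLabels, figureLabels, prompt.toString());
    }
}
```

Because the maps are keyed by label, a reference such as S2 appearing in a model answer can be resolved back to the entity that produced it.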

New API DTOs

SavedSummaryItem (list view — no full text)

record SavedSummaryItem(UUID id, int summaryNumber, Instant generatedAt) {}

Used in GET /api/v1/topics/{id}/summaries to show the summary history list without transmitting full text.
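
A minimal sketch of how stored summaries might be projected into this list view. `StoredSummary` and `SummaryListMapper` are hypothetical names, and ordering by summaryNumber is an assumption; the spec does not state the list order.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// DTO as defined in this spec.
record SavedSummaryItem(UUID id, int summaryNumber, Instant generatedAt) {}

// Hypothetical stand-in for a stored topic_summary row.
record StoredSummary(UUID id, int summaryNumber, String summary, Instant generatedAt) {}

final class SummaryListMapper {
    // Copies only the identifying fields, leaving the full summary text behind.
    static List<SavedSummaryItem> toItems(List<StoredSummary> stored) {
        return stored.stream()
                .sorted(Comparator.comparingInt(StoredSummary::summaryNumber))
                .map(s -> new SavedSummaryItem(s.id(), s.summaryNumber(), s.generatedAt()))
                .toList();
    }
}
```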

TopicSummaryResponse (existing, extended)

Adds id (UUID) and summaryNumber (int) fields so the frontend knows which saved record was just created.


Existing entities (unchanged)

| Entity | Table | Change |
|---|---|---|
| SectionEntity | section | None |
| FigureEntity | figure | None |
| Message | message | sources field gets refLabel key added per entry |
| ChatSession | chat_session | None |
| Book | book | None |
| Topic | topic | None |

Message.sources structure (existing, clarified)

After the RAG quality feature, each entry includes a refLabel key:

{
  "type": "TEXT",
  "refLabel": "S1",
  "bookTitle": "Youmans & Winn Neurological Surgery",
  "page": 142,
  "chunkText": "..."
}
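
For illustration, an entry with the keys shown above could be assembled as an ordered map before serialization. `SourceEntries.textSource` is a hypothetical helper; how the project actually builds and serializes these entries is not specified here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SourceEntries {
    // Builds one Message.sources entry of type TEXT, keeping the key order
    // used in the example JSON.
    static Map<String, Object> textSource(String refLabel, String bookTitle,
                                          int page, String chunkText) {
        Map<String, Object> entry = new LinkedHashMap<>();
        entry.put("type", "TEXT");
        entry.put("refLabel", refLabel);
        entry.put("bookTitle", bookTitle);
        entry.put("page", page);
        entry.put("chunkText", chunkText);
        return entry;
    }
}
```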