AI Teacher — Neurosurgeon RAG Learning Platform

A web application that lets neurosurgeons upload medical textbooks (PDF) and embeds them into a pgvector store. Users then pick a topic from a predefined list to receive an AI-generated cross-book summary, and can continue in a RAG chat grounded in the uploaded books.

Architecture

graph TD
    User["Neurosurgeon (Browser)"]
    Login["Login Page\n(username + password form)"]
    FE["Frontend\nVue.js 3 / Vite\n:5173"]
    BE["Backend\nSpring Boot 4 / Spring AI\n:8080"]
    Auth["Spring Security\nHTTP Basic Auth"]
    DB["PostgreSQL + pgvector\n(source of truth)"]
    FS["File Store\nuploads/ (local disk)\nExtracted figure PNGs"]
    LLM["LLM Provider\n(OpenAI)\nEmbeddings + Chat + Vision"]

    User -->|"First visit / unauthenticated"| Login
    Login -->|"Submit credentials\n(GET /api/v1/auth/check)"| Auth
    Auth -->|"401 → back to login\n200 → app access"| Login
    Login -->|"Authenticated"| FE
    FE -->|"REST /api/v1/...\n(HTTP Basic on every request)"| Auth
    Auth --> BE
    BE -->|"JDBC — books, chapters,\nsections, figures, refs"| DB
    BE -->|"pgvector — text chunks\n+ figure caption vectors"| DB
    BE -->|"PNG read/write\n(figure extraction)"| FS
    FE -->|"GET /api/v1/figures/**\n(static file serving)"| BE
    BE -->|"Embedding + Chat\n+ Vision (image description)"| LLM

    subgraph "Embedding Pipeline (per PDF upload)"
        EP1["Parse pages → SectionEntity"]
        EP2["Extract images → FigureEntity"]
        EP3["Vision describe → embed caption"]
        EP4["Chunk text → embed chunks"]
        EP5["Link chunks ↔ figures"]
        EP6["LLM enrich chunk\n(entities, facet, summary)\n→ chunk_metadata"]
        EP1 --> EP2
        EP1 --> EP4
        EP2 --> EP3
        EP4 --> EP5
        EP3 --> EP5
        EP4 --> EP6
    end

    subgraph "Retrieval Pipeline (per chat query)"
        RP0["Query expansion\n(QueryExpansionService)\nlay → clinical terms"]
        RP1["Text chunk search (topK=5)"]
        RP2["Figure caption search (topK=3)"]
        RP3["Expand chunks → ±1-page section text"]
        RP4["Fetch linked figures (chunk_figure_ref)"]
        RP5["Merge + deduplicate figures"]
        RP6["Build labelled prompt\n[S1],[F1]… tags"]
        RP7["LLM chat call"]
        RP8["Citation validation\n(CitationValidatorService)\nstrip hallucinated refs"]
        RP0 --> RP1
        RP0 --> RP2
        RP1 --> RP3
        RP1 --> RP4
        RP2 --> RP5
        RP4 --> RP5
        RP3 --> RP6
        RP5 --> RP6
        RP6 --> RP7
        RP7 --> RP8
    end
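
The final retrieval step, citation validation, can be illustrated with a short sketch. It is written in Python for brevity and is not the actual CitationValidatorService: the function name, the regular expression, and the rule that a tag is valid only if its number falls within the count of labelled sources or figures are all assumptions made for illustration.

```python
import re

# Matches the [S#] / [F#] labels that the prompt builder assigns.
TAG = re.compile(r"\[([SF])(\d+)\]")

def strip_unknown_citations(answer: str, n_sources: int, n_figures: int) -> str:
    """Remove citation tags that do not refer to a labelled source or figure."""
    limits = {"S": n_sources, "F": n_figures}

    def keep_or_drop(match: re.Match) -> str:
        kind, index = match.group(1), int(match.group(2))
        # Keep the tag only if it points at something that was in the prompt.
        return match.group(0) if 1 <= index <= limits[kind] else ""

    return TAG.sub(keep_or_drop, answer)

# The prompt contained two sources and one figure, so [S7] is hallucinated.
raw = "The approach is described in [S1][S7] and illustrated in [F1]."
cleaned = strip_unknown_citations(raw, n_sources=2, n_figures=1)
print(cleaned)
```

Validating against the labels that were actually placed in the prompt keeps the check independent of the model: any tag the model invents is dropped before the answer reaches the user.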

Concept Retrieval Pipeline (per concept report)

Concept retrieval is an alternative to the semantic-similarity flow above. It uses the LLM-tagged chunk_metadata rows written at indexing time to exhaustively gather every chunk that concerns a concept (e.g. "aneurysm"), bucketed by facet. One synthesis call per facet yields a structured, multi-section report.

sequenceDiagram
    participant User
    participant FE as Frontend
    participant BE as Backend (ConceptReportService)
    participant Retr as ConceptRetriever
    participant DB as chunk_metadata (GIN)
    participant Vec as vector_store
    participant LLM

    User->>FE: Click "Generate Concept Report" on topic
    FE->>BE: POST /api/v1/topics/{id}/concept-reports
    loop per READY book
        BE->>Retr: retrieveByConcept(topicName, bookId)
        Retr->>DB: WHERE entities @> [canonical]
        alt SQL hits found
            DB-->>Retr: chunks grouped by facet
        else no match (typo / synonym)
            Retr->>Vec: similaritySearch topK=30
            Vec-->>Retr: chunk ids
            Retr->>DB: findByChunkIdIn → group by facet
        end
    end
    BE->>BE: merge facets across books, assign global [S#]/[F#]
    loop per non-empty facet
        BE->>LLM: synthesize facet section (focused prompt)
        LLM-->>BE: facet markdown
    end
    BE->>BE: persist concept_report
    BE-->>FE: { facets[], sources[] }
    FE->>User: render facet-labelled report + inline figures
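
The exact-match-then-fallback logic in the diagram above can be sketched in a few lines. This is an in-memory Python illustration, not the real ConceptRetriever: the row layout, the facet names, the lowercase canonicalisation, and the `similarity_search` callable are assumptions standing in for the chunk_metadata table and the vector store.

```python
from collections import defaultdict
from typing import Callable, Dict, List

def retrieve_by_concept(
    concept: str,
    rows: List[dict],
    similarity_search: Callable[[str, int], List[str]],
) -> Dict[str, List[str]]:
    """Return chunk ids grouped by facet for one concept."""
    canonical = concept.strip().lower()  # assumed canonical form
    # Step 1: exhaustive match on the tagged entities (the SQL path).
    hits = [row for row in rows if canonical in row["entities"]]
    if not hits:
        # Step 2: no tagged match (typo or synonym), fall back to vector search.
        ids = set(similarity_search(concept, 30))
        hits = [row for row in rows if row["chunk_id"] in ids]
    by_facet: Dict[str, List[str]] = defaultdict(list)
    for row in hits:
        by_facet[row["facet"]].append(row["chunk_id"])
    return dict(by_facet)

# Hypothetical metadata rows and a stubbed vector search.
rows = [
    {"chunk_id": "c1", "entities": ["aneurysm"], "facet": "anatomy"},
    {"chunk_id": "c2", "entities": ["aneurysm", "clipping"], "facet": "treatment"},
    {"chunk_id": "c3", "entities": ["glioma"], "facet": "anatomy"},
]
fake_search = lambda query, top_k: ["c2"]

exact = retrieve_by_concept("Aneurysm", rows, fake_search)
fallback = retrieve_by_concept("aneurism", rows, fake_search)
print(exact, fallback)
```

The first call finds tagged rows and never touches the vector store; the misspelled second call finds none and uses the fallback instead.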

Backfill path for already-embedded books: POST /api/v1/admin/books/{id}/enrich scans vector_store for TEXT chunks missing chunk_metadata rows and enriches them in place. Idempotent — re-running is a no-op.
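
Why re-running the backfill is a no-op can be shown with a small sketch. It assumes the endpoint only selects chunks that have no metadata row yet; the function name, the in-memory dictionary standing in for chunk_metadata, and the `enrich` stub are hypothetical.

```python
from typing import Callable, Dict, List

def enrich_missing(
    text_chunk_ids: List[str],
    metadata: Dict[str, dict],
    enrich: Callable[[str], dict],
) -> int:
    """Enrich only chunks without a metadata row; return how many were enriched."""
    missing = [cid for cid in text_chunk_ids if cid not in metadata]
    for cid in missing:
        metadata[cid] = enrich(cid)  # one LLM call per missing chunk
    return len(missing)

# Stub standing in for the LLM enrichment (entities, facet, summary).
def stub_enrich(chunk_id: str) -> dict:
    return {"entities": [], "facet": "unknown", "summary": f"summary of {chunk_id}"}

store = {"c1": {"entities": ["aneurysm"], "facet": "anatomy", "summary": "..."}}
first_run = enrich_missing(["c1", "c2", "c3"], store, stub_enrich)
second_run = enrich_missing(["c1", "c2", "c3"], store, stub_enrich)
print(first_run, second_run)
```

Because the selection step filters on missing rows, the second run finds nothing to do and makes no LLM calls.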

Marker API Response Structure

The PDF parsing pipeline calls a local Marker server (POST /marker/upload).

Top-level envelope

{
  "format": "json",
  "output": "<JSON-encoded string>"
}

output is a JSON-encoded string (not a nested object) and must be parsed a second time to get the document tree.
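
The two-step decoding can be sketched as follows. This is a minimal Python illustration (the backend itself is Java); the function name and the sample payload are hypothetical, and only the `format` and `output` fields come from the envelope shown above.

```python
import json

def parse_marker_response(body: str) -> dict:
    """Decode the envelope, then decode the nested `output` string."""
    envelope = json.loads(body)  # first pass: the envelope
    if envelope.get("format") != "json":
        raise ValueError(f"unexpected format: {envelope.get('format')!r}")
    # Second pass: `output` is itself a JSON document serialised as a string.
    return json.loads(envelope["output"])

# Hypothetical sample shaped like the envelope above.
sample_tree = {"children": [{"id": "/page/0/Page/0", "block_type": "Page", "children": []}]}
sample_body = json.dumps({"format": "json", "output": json.dumps(sample_tree)})

document = parse_marker_response(sample_body)
print(len(document["children"]), document["children"][0]["block_type"])
```

Skipping the second `json.loads` leaves `output` as a plain string, so any attempt to read `children` from it fails.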

Parsed output shape

{
  "children": [ <Page block>, ... ]
}

Block types

Every block shares these fields:

| Field | Type | Notes |
| --- | --- | --- |
| id | string | e.g. /page/0/Picture/2 |
| block_type | string | see table below |
| html | string | rendered HTML; may contain <content-ref> |
| bbox | [x0,y0,x1,y1] | bounding box in page coordinates |
| children | array or null | nested blocks |
| images | object or null | base64 PNG map (leaf image blocks only) |
| section_hierarchy | object | heading ancestry |

Known block_type values

| block_type | Category | Notes |
| --- | --- | --- |
| Page | structure | Top-level; direct children are the page content |
| SectionHeader | text | Section / chapter heading |
| Text | text | |
| TextInlineMath | text | |
| ListItem | text | |
| Table | text | |
| Code | text | |
| Equation | text | |
| Footnote | text | |
| Caption | text | Usually a child of a *Group block |
| PageHeader | text | |
| PageFooter | text | |
| Handwriting | text | |
| Picture | image | Leaf block; images map holds base64 PNG keyed by ID |
| Figure | image | Leaf block; same as Picture |
| PictureGroup | container | Wraps one Picture + one Caption child |
| FigureGroup | container | Wraps one Figure + one Caption child |

Image extraction

Images are only present on leaf image blocks (Picture, Figure). Group blocks (PictureGroup, FigureGroup) have images: null — the base64 PNG lives on the child leaf block.

PictureGroup
├── Picture   ← images: { "/page/0/Picture/2": "<base64 PNG>" }
└── Caption   ← html: "<p>Figure 1 — ...</p>"
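
A recursive walk that collects images from leaf blocks and pairs each with the caption of its enclosing group might look like the sketch below. It is a Python illustration under the block layout described above, not the backend's extraction code; the function name and the sample tree are hypothetical.

```python
import base64
from typing import Iterator, Optional, Tuple

LEAF_IMAGE_TYPES = {"Picture", "Figure"}
GROUP_TYPES = {"PictureGroup", "FigureGroup"}

def iter_figures(
    block: dict, caption_html: Optional[str] = None
) -> Iterator[Tuple[str, bytes, Optional[str]]]:
    """Yield (image_id, png_bytes, caption_html) for every leaf image block."""
    block_type = block.get("block_type")
    children = block.get("children") or []  # children may be null
    if block_type in GROUP_TYPES:
        # The caption is a sibling of the image inside the group.
        caption_html = next(
            (c.get("html") for c in children if c.get("block_type") == "Caption"),
            None,
        )
    if block_type in LEAF_IMAGE_TYPES:
        for image_id, encoded in (block.get("images") or {}).items():
            yield image_id, base64.b64decode(encoded), caption_html
    for child in children:
        yield from iter_figures(child, caption_html)

# Hypothetical document tree; the bytes are a placeholder, not a real PNG.
encoded = base64.b64encode(b"placeholder image bytes").decode("ascii")
document = {"children": [{
    "id": "/page/0/Page/0", "block_type": "Page", "images": None, "children": [{
        "id": "/page/0/PictureGroup/1", "block_type": "PictureGroup", "images": None,
        "children": [
            {"id": "/page/0/Picture/2", "block_type": "Picture", "children": None,
             "images": {"/page/0/Picture/2": encoded}},
            {"id": "/page/0/Caption/3", "block_type": "Caption", "children": None,
             "images": None, "html": "<p>Figure 1</p>"},
        ],
    }],
}]}

figures = [fig for page in document["children"] for fig in iter_figures(page)]
print([(image_id, caption) for image_id, _, caption in figures])
```

A Picture or Figure that sits directly under a Page, outside any group, is still yielded, with `None` as its caption.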

Stack

  • Backend: Spring Boot 4.0.5 + Spring AI 2.0.0-M4, Java 25, Maven
  • Frontend: Vue.js 3 + Vite + TypeScript + Pinia + Axios
  • Database: PostgreSQL 16 + pgvector extension
  • Auth: HTTP Basic (single shared in-memory user)
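
Since every request carries HTTP Basic credentials, a client has to build the Authorization header itself. The sketch below shows the standard encoding in Python; the username `user` and the password are placeholders, as this README does not state the shared user's name.

```python
import base64

def basic_auth_header(username: str, password: str) -> dict:
    """Build the Authorization header for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}

# Placeholder credentials; the real password comes from APP_PASSWORD.
headers = basic_auth_header("user", "secret")
print(headers)
```

The same header would be sent with GET /api/v1/auth/check at login and with each later /api/v1/... call.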

Quick Start

See specs/001-neuro-rag-learning/quickstart.md for full instructions.

Local Dev (JVM)

# Start the database
docker compose up -d

# Backend
cd backend
mvn spring-boot:run

# Frontend
cd frontend
npm install
npm run dev

Native Image Build

Produces a GraalVM native binary packaged into a minimal Docker image via Jib.

Prerequisite: GraalVM 25 must be installed and set as JAVA_HOME.

# Install GraalVM 25 CE via sdkman (one-time)
sdk install java 25-graalce
sdk use java 25-graalce

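# Build the native executable, then build and push the image with Jib
# (jib:build pushes straight to the registry and does not need a local Docker daemon)
cd backend
mvn -Pnative package jib:build -DskipTests

# Same push with explicit registry credentials
mvn -Pnative jib:build -Djib.to.auth.username=admin -Djib.to.auth.password=""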

Backend build (buildah)

JVM image (Dockerfile — Eclipse Temurin 21):

buildah build \
  --platform linux/arm64 \
  --tag zot.immich-ad.ovh/ai-teacher-backend:latest \
  backend/

buildah login zot.immich-ad.ovh
buildah push --tls-verify=false zot.immich-ad.ovh/ai-teacher-backend:latest

Native image (Dockerfile.native — GraalVM 25, produces a minimal Debian-slim image):

buildah build \
  --platform linux/arm64 \
  --file backend/Dockerfile.native \
  --tag zot.immich-ad.ovh/ai-teacher-backend-native:latest \
  backend/

buildah push --tls-verify=false zot.immich-ad.ovh/ai-teacher-backend-native:latest

Frontend build

buildah build \
  --platform linux/arm64 \
  --tag zot.immich-ad.ovh/ai-teacher-frontend:latest \
  frontend/
buildah login zot.immich-ad.ovh

Push to the private repository:

buildah push --tls-verify=false zot.immich-ad.ovh/ai-teacher-frontend:latest

Run Native Stack (Docker Compose)

# Copy and fill in secrets
cp .env.example .env
# edit .env — add OPENAI_API_KEY at minimum

# Start PostgreSQL + native backend
docker compose -f docker-compose.native.yml up

The app is then available at http://localhost:8080.

Build Pipeline (Native)

graph LR
    SRC["Source Code\n(Java 25)"]
    AOT["Spring Boot AOT\n(process-aot)"]
    NI["GraalVM native-image\n(native-maven-plugin)"]
    EXE["Native Executable\ntarget/ai-teacher-backend"]
    JIB["Jib\n(jib-native-image-extension)"]
    IMG["Docker Image\nai-teacher-backend:latest\n(distroless base)"]

    SRC --> AOT
    AOT --> NI
    NI --> EXE
    EXE --> JIB
    JIB --> IMG

Environment Variables

Backend

| Variable | Required | Description |
| --- | --- | --- |
| OPENAI_API_KEY | Yes | OpenAI API key for embeddings and chat |
| APP_PASSWORD | Yes | Shared password for HTTP Basic auth |
| DB_URL | Yes | JDBC URL, e.g. jdbc:postgresql://localhost:5432/aiteacher |
| DB_USERNAME | Yes | Database username |
| DB_PASSWORD | Yes | Database password |
| FIGURE_STORAGE_PATH | No | Base path for uploaded PDFs and extracted figures (default: ./uploads) |
| UPLOAD_ENABLED | No | Set to false to disable the book upload endpoint (default: true) |
| DELETE_ENABLED | No | Set to false to disable the book delete endpoint (default: true) |
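
How the required variables and defaults above fit together can be sketched as a small loader. This is a Python illustration, not the Spring Boot configuration itself; the function names are hypothetical, and treating only the literal `false` (in any letter case) as "disabled" is an assumption.

```python
import os
from typing import Mapping

REQUIRED = ("OPENAI_API_KEY", "APP_PASSWORD", "DB_URL", "DB_USERNAME", "DB_PASSWORD")

def is_enabled(env: Mapping[str, str], name: str) -> bool:
    # Assumption: unset means enabled; only "false" switches the feature off.
    return env.get(name, "true").strip().lower() != "false"

def load_backend_config(env: Mapping[str, str] = os.environ) -> dict:
    """Check the required variables and apply the documented defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError(f"missing required variables: {', '.join(missing)}")
    return {
        "db_url": env["DB_URL"],
        "figure_storage_path": env.get("FIGURE_STORAGE_PATH", "./uploads"),
        "upload_enabled": is_enabled(env, "UPLOAD_ENABLED"),
        "delete_enabled": is_enabled(env, "DELETE_ENABLED"),
    }

# Placeholder values only; real secrets belong in .env.
example_env = {
    "OPENAI_API_KEY": "sk-placeholder",
    "APP_PASSWORD": "change-me",
    "DB_URL": "jdbc:postgresql://localhost:5432/aiteacher",
    "DB_USERNAME": "aiteacher",
    "DB_PASSWORD": "change-me",
    "UPLOAD_ENABLED": "false",
}
config = load_backend_config(example_env)
print(config)
```

Failing fast on missing required variables surfaces a misconfigured deployment at startup instead of at the first upload or chat request.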

Frontend

| Variable | Required | Description |
| --- | --- | --- |
| VITE_API_URL | No | Backend API base URL (default: /api/v1) |
| VITE_APP_PASSWORD | Yes | Shared password for HTTP Basic auth (must match APP_PASSWORD) |
| VITE_UPLOAD_ENABLED | No | Set to false to hide the upload UI (default: true) |
| VITE_DELETE_ENABLED | No | Set to false to hide the delete button (default: true) |