first implementation - image/drawing integration

2026-04-04 12:56:56 +02:00
parent fc5b22fba1
commit 5acfdd33c1
42 changed files with 2854 additions and 151 deletions
@@ -0,0 +1,86 @@
+# Quickstart: Enhanced Embedding with Image Parsing and Metadata
+
+**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03
+
+---
+
+## Prerequisites
+
+- Docker Compose running (PostgreSQL + pgvector)
+- OpenAI API key set in `backend/src/main/resources/application.properties` or as env var `OPENAI_API_KEY`
+- Java 25 + Maven on PATH
+
+---
+
+## New Configuration
+
+Add to `backend/src/main/resources/application.properties`:
+
+```properties
+# Figure storage
+app.figure-storage.base-path=./uploads
+app.figure-storage.min-image-size-px=100
+```
+
+The `uploads/figures/` directory is created automatically on first use. Add it to `.gitignore`.
+
+---
+
+## Database Migration
+
+Two new Flyway migrations run automatically on startup:
+
+- `V4__document_hierarchy.sql` — adds `chapter` and `section` tables
+- `V5__figures_and_refs.sql` — adds `figure` and `chunk_figure_ref` tables
+
+No manual database setup is needed.
+
+---
+
+## Re-embedding Existing Books
+
+Books embedded by feature 001 (text-only) remain functional for text queries. To add image
+support, trigger a re-embed, replacing `{bookId}` with the ID of the book:
+
+```bash
+curl -X POST http://localhost:8080/api/v1/books/{bookId}/reembed \
+  -u admin:password
+```
+
+The book transitions to `PROCESSING`, old chunks and figures are deleted, and the new
+image-aware pipeline runs. Poll `GET /api/v1/books` to track the status.
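+
+The polling step can be sketched in TypeScript as follows. This is a minimal illustration,
+not project code: the `BookSummary` shape (`id`, `status`), the `waitForBook` helper, and its
+interval and attempt defaults are assumptions. Only the `PROCESSING` and `READY` status values
+and the `GET /api/v1/books` endpoint come from this quickstart.
+
+```typescript
+// Assumed shape of one entry in the `GET /api/v1/books` response (hypothetical).
+interface BookSummary {
+  id: string;
+  status: string; // e.g. "PROCESSING" or "READY"
+}
+
+// Caller-supplied function that fetches the current book list.
+type ListBooks = () => Promise<BookSummary[]>;
+
+// Polls until the book leaves PROCESSING or the attempt budget runs out.
+// Resolves with the first non-PROCESSING status it observes.
+async function waitForBook(
+  bookId: string,
+  listBooks: ListBooks,
+  intervalMs = 2000,
+  maxAttempts = 150,
+): Promise<string> {
+  for (let attempt = 0; attempt < maxAttempts; attempt++) {
+    const books = await listBooks();
+    const book = books.filter((b) => b.id === bookId)[0];
+    if (book === undefined) {
+      throw new Error(`book ${bookId} not found`);
+    }
+    if (book.status !== "PROCESSING") {
+      return book.status;
+    }
+    // Wait before the next poll.
+    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
+  }
+  throw new Error(`book ${bookId} still PROCESSING after ${maxAttempts} polls`);
+}
+```
+
+The `listBooks` argument would typically wrap a `fetch` call to `GET /api/v1/books` using the
+same credentials as the `curl` example. Injecting it keeps the helper testable without a
+running backend.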
+
+---
+
+## Verifying Image Extraction
+
+1. Upload a PDF with diagrams: `POST /api/v1/books/upload`
+2. Wait for `status: "READY"` via `GET /api/v1/books`
+3. List figures with `GET /api/v1/books/{id}/figures`; the response should contain at least one entry for each page that contains an image
+4. Ask a diagram-specific question in chat; the `sources` in the response should include an entry with `type: "FIGURE"`
+
+---
+
+## Frontend: Rendering Inline Figures
+
+The `content` field of an assistant message contains figure references in the format
+`[Fig. 12-4, p.184]`. The frontend should:
+
+1. Parse `[Fig. X, p.N]` patterns in assistant message text
+2. Look up the matching entry in `sources` where `type === "FIGURE"`
+3. Render the figure inline using the `imageUrl` field
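+
+The three steps above could look roughly like this in TypeScript. It is a hedged sketch, not
+the project's frontend code: the regular expression is inferred from the single example
+`[Fig. 12-4, p.184]`, and the `label` and `page` fields used to match a source are
+hypothetical. Only `type` and `imageUrl` are named in this quickstart.
+
+```typescript
+// A parsed reference such as "[Fig. 12-4, p.184]".
+interface FigureRef {
+  raw: string;   // full bracketed text
+  label: string; // e.g. "12-4"
+  page: number;  // e.g. 184
+  index: number; // offset of the reference within the message text
+}
+
+// Assumed shape of a `sources` entry; `label` and `page` are hypothetical keys.
+interface Source {
+  type: string;
+  imageUrl?: string;
+  label?: string;
+  page?: number;
+}
+
+// Matches "[Fig. X, p.N]", tolerating optional spaces after "Fig." and "p.".
+const FIGURE_REF_SOURCE = "\\[Fig\\.\\s*([^,\\]]+?)\\s*,\\s*p\\.\\s*(\\d+)\\]";
+
+// Step 1: find every figure reference in the assistant message text.
+function parseFigureRefs(content: string): FigureRef[] {
+  const pattern = new RegExp(FIGURE_REF_SOURCE, "g"); // fresh regex per call
+  const refs: FigureRef[] = [];
+  let m: RegExpExecArray | null;
+  while ((m = pattern.exec(content)) !== null) {
+    refs.push({ raw: m[0], label: m[1], page: Number(m[2]), index: m.index });
+  }
+  return refs;
+}
+
+// Step 2: look up the FIGURE source for a reference (matching rule is assumed).
+function findFigureSource(ref: FigureRef, sources: Source[]): Source | undefined {
+  return sources.filter(
+    (s) => s.type === "FIGURE" && s.label === ref.label && s.page === ref.page,
+  )[0];
+}
+```
+
+Step 3 then renders `imageUrl` from the matched source at the position given by `index`. A
+reference with no matching source can be left as plain text.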
+
+---
+
+## Running Tests
+
+```bash
+cd backend
+mvn test
+```
+
+Key new test classes:
+- `FigureExtractionServiceTest` — unit tests for image extraction and classification
+- `NeurosurgeryRetrieverTest` — unit tests for dual-search merge and deduplication
+- `BookEmbeddingServiceIntegrationTest` — integration test: upload PDF with known figures,
+  verify figures appear in `GET /api/v1/books/{id}/figures`