first implementation - image/drawing integration
+12
@@ -1,3 +1,15 @@
+# Runtime uploads (extracted figures)
+uploads/
+
+# Java build
+target/
+*.class
+*.jar
+
+# Node
+node_modules/
+dist/
+
 # OS
 .DS_Store
 Thumbs.db
@@ -1,8 +1,10 @@
 # ai-teacher Development Guidelines
 
-Auto-generated from all feature plans. Last updated: 2026-03-31
+Auto-generated from all feature plans. Last updated: 2026-04-03
 
 ## Active Technologies
+- Java 25 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (embeddings + chat), PDFBox (via Spring AI PDF reader dependency) (002-image-aware-embedding)
+- PostgreSQL (JPA + Flyway), pgvector (Spring AI `VectorStore`), local file system (extracted images — `/uploads/figures/`) (002-image-aware-embedding)
 
 - Java 21 (backend), TypeScript / Node 20 (frontend) (001-neuro-rag-learning)
 
@@ -22,6 +24,7 @@ npm test && npm run lint
 Java 21 (backend), TypeScript / Node 20 (frontend): Follow standard conventions
 
 ## Recent Changes
+- 002-image-aware-embedding: Added Java 25 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (embeddings + chat), PDFBox (via Spring AI PDF reader dependency)
 
 - 001-neuro-rag-learning: Added Java 21 (backend), TypeScript / Node 20 (frontend)
 
@@ -11,13 +11,45 @@ graph TD
     User["Neurosurgeon (Browser)"]
     FE["Frontend\nVue.js 3 / Vite\n:5173"]
     BE["Backend\nSpring Boot 4 / Spring AI\n:8080"]
-    DB["PostgreSQL + pgvector\n(provided)"]
-    LLM["LLM Provider\n(OpenAI / configurable)"]
+    DB["PostgreSQL + pgvector\n(source of truth)"]
+    FS["File Store\nuploads/ (local disk)\nExtracted figure PNGs"]
+    LLM["LLM Provider\n(OpenAI)\nEmbeddings + Chat + Vision"]
 
     User -->|HTTP| FE
     FE -->|REST /api/v1/...| BE
-    BE -->|JDBC / pgvector| DB
-    BE -->|Embedding + Chat API| LLM
+    BE -->|"JDBC — books, chapters,\nsections, figures, refs"| DB
+    BE -->|"pgvector — text chunks\n+ figure caption vectors"| DB
+    BE -->|"PNG read/write\n(figure extraction)"| FS
+    FE -->|"GET /api/v1/figures/**\n(static file serving)"| BE
+    BE -->|"Embedding + Chat\n+ Vision (image description)"| LLM
+
+    subgraph "Embedding Pipeline (per PDF upload)"
+        EP1["Parse pages → SectionEntity"]
+        EP2["Extract images → FigureEntity"]
+        EP3["Vision describe → embed caption"]
+        EP4["Chunk text → embed chunks"]
+        EP5["Link chunks ↔ figures"]
+        EP1 --> EP2
+        EP1 --> EP4
+        EP2 --> EP3
+        EP4 --> EP5
+        EP3 --> EP5
+    end
+
+    subgraph "Retrieval Pipeline (per chat query)"
+        RP1["Text chunk search (topK=5)"]
+        RP2["Figure caption search (topK=3)"]
+        RP3["Expand chunks → full section text"]
+        RP4["Fetch linked figures (chunk_figure_ref)"]
+        RP5["Merge + deduplicate figures"]
+        RP6["Build LLM prompt + call"]
+        RP1 --> RP3
+        RP1 --> RP4
+        RP2 --> RP5
+        RP4 --> RP5
+        RP3 --> RP6
+        RP5 --> RP6
+    end
```
|
```
|
||||||
|
|
||||||
## Stack
|
## Stack
|
||||||
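Review note on the retrieval pipeline in the diagram: step RP5 ("Merge + deduplicate figures") combines figures found by caption search with figures linked from text chunks. The retriever is not part of this excerpt, so this is only a minimal sketch of one possible approach. It assumes figures are identified by string IDs and that caption-search hits keep their position ahead of chunk-linked figures. `FigureMerge` and `mergeFigureIds` are hypothetical names, not code from this commit.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the commit: merges two ranked lists of figure IDs.
final class FigureMerge {

    // Returns caption-search hits first, then chunk-linked figures,
    // dropping any ID that has already been seen (insertion order is preserved).
    static List<String> mergeFigureIds(List<String> captionHits, List<String> linkedFigures) {
        Set<String> seen = new LinkedHashSet<>(captionHits);
        seen.addAll(linkedFigures);
        return new ArrayList<>(seen);
    }
}
```

A `LinkedHashSet` is used here so that a figure returned by both searches appears once, at the position of its first occurrence.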
@@ -56,3 +88,4 @@ npm run dev
 | `DB_URL` | Yes | JDBC URL, e.g. `jdbc:postgresql://localhost:5432/aiteacher` |
 | `DB_USERNAME` | Yes | Database username |
 | `DB_PASSWORD` | Yes | Database password |
+| `FIGURE_STORAGE_PATH` | No | Base path for uploaded PDFs and extracted figures (default: `./uploads`) |
+8
-1
@@ -95,12 +95,19 @@
             <artifactId>spring-ai-advisors-vector-store</artifactId>
         </dependency>
 
-        <!-- Spring AI — PDF document reader -->
+        <!-- Spring AI — PDF document reader (includes PDFBox transitively) -->
         <dependency>
             <groupId>org.springframework.ai</groupId>
             <artifactId>spring-ai-pdf-document-reader</artifactId>
         </dependency>
 
+        <!-- PDFBox — explicit for image extraction per page -->
+        <dependency>
+            <groupId>org.apache.pdfbox</groupId>
+            <artifactId>pdfbox</artifactId>
+            <version>3.0.3</version>
+        </dependency>
+
         <!-- Jackson (JSON) -->
         <dependency>
             <groupId>com.fasterxml.jackson.core</groupId>
@@ -1,5 +1,7 @@
 package com.aiteacher.book;
 
+import com.aiteacher.document.FigureEntity;
+import com.aiteacher.document.FigureRepository;
 import org.springframework.http.HttpStatus;
 import org.springframework.http.ResponseEntity;
 import org.springframework.web.bind.annotation.*;
@@ -15,9 +17,11 @@ import java.util.UUID;
 public class BookController {
 
     private final BookService bookService;
+    private final FigureRepository figureRepository;
 
-    public BookController(BookService bookService) {
+    public BookController(BookService bookService, FigureRepository figureRepository) {
         this.bookService = bookService;
+        this.figureRepository = figureRepository;
     }
 
     @PostMapping(consumes = "multipart/form-data")
@@ -46,6 +50,36 @@ public class BookController {
         return ResponseEntity.noContent().build();
     }
 
+    @PostMapping("/{id}/reembed")
+    public ResponseEntity<Map<String, Object>> reembed(@PathVariable UUID id) {
+        Book book = bookService.reembed(id);
+        return ResponseEntity.accepted().body(Map.of(
+                "bookId", book.getId(),
+                "status", BookStatus.PROCESSING.name()
+        ));
+    }
+
+    @GetMapping("/{id}/figures")
+    public ResponseEntity<List<FigureResponse>> figures(@PathVariable UUID id) {
+        bookService.getById(id); // 404 if not found
+        List<FigureResponse> responses = figureRepository.findAllByBookId(id)
+                .stream()
+                .map(f -> toFigureResponse(id, f))
+                .toList();
+        return ResponseEntity.ok(responses);
+    }
+
+    private FigureResponse toFigureResponse(UUID bookId, FigureEntity f) {
+        String filename = f.getImagePath().substring(f.getImagePath().lastIndexOf('/') + 1);
+        String imageUrl = "/api/v1/figures/" + bookId + "/" + filename;
+        return new FigureResponse(
+                f.getId(), f.getLabel(), f.getCaption(),
+                f.getFigureType().name(), f.getPage(), imageUrl,
+                f.getSectionId(),
+                null // section title not eagerly loaded here
+        );
+    }
+
     private Map<String, Object> toSummaryResponse(Book book) {
         return Map.of(
                 "id", book.getId(),
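Review note on `toFigureResponse`: the file name is taken as everything after the last `/` in the stored image path, and the public URL is built from the book ID plus that name. The sketch below isolates that string logic so it can be checked without Spring. It additionally accepts `\` as a separator, which is an assumption about how paths might be stored on Windows, not something the commit states. `FigureUrls` and its methods are hypothetical names.

```java
// Hypothetical helper, not part of the commit: isolates the URL-building logic.
final class FigureUrls {

    // Returns the last path segment, treating both '/' and '\' as separators.
    // If neither separator is present, the whole input is returned.
    static String fileName(String imagePath) {
        int cut = Math.max(imagePath.lastIndexOf('/'), imagePath.lastIndexOf('\\'));
        return imagePath.substring(cut + 1);
    }

    // Builds the URL in the same shape as the controller: /api/v1/figures/{bookId}/{file}.
    static String imageUrl(String bookId, String imagePath) {
        return "/api/v1/figures/" + bookId + "/" + fileName(imagePath);
    }
}
```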
@@ -1,41 +1,75 @@
 package com.aiteacher.book;
 
+import com.aiteacher.document.*;
+import com.aiteacher.figure.FigureStorageService;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 import org.springframework.ai.document.Document;
-import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
-import org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig;
 import org.springframework.ai.vectorstore.VectorStore;
 import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
-import org.springframework.core.io.FileSystemResource;
+import org.springframework.beans.factory.annotation.Value;
 import org.springframework.scheduling.annotation.Async;
 import org.springframework.stereotype.Service;
+import org.springframework.transaction.annotation.Transactional;
 
 import java.nio.file.Path;
-import java.util.List;
-import java.util.UUID;
-import java.util.regex.Pattern;
+import java.time.Instant;
+import java.util.*;
 
 @Service
 public class BookEmbeddingService {
 
     private static final Logger log = LoggerFactory.getLogger(BookEmbeddingService.class);
 
-    // Pattern to detect diagram/figure captions
-    private static final Pattern CAPTION_PATTERN =
-            Pattern.compile("^(Figure|Fig\\.|Table|Diagram)\\s+[\\d.]+", Pattern.CASE_INSENSITIVE);
-
     private final VectorStore vectorStore;
     private final BookRepository bookRepository;
 
-    public BookEmbeddingService(VectorStore vectorStore, BookRepository bookRepository) {
+    @Value("${app.embedding.batch-size:50}")
+    private int embeddingBatchSize;
+
+    @Value("${app.embedding.batch-delay-ms:1000}")
+    private long embeddingBatchDelayMs;
+    private final PdfStructureParser pdfStructureParser;
+    private final FigureExtractionService figureExtractionService;
+    private final VisionDescriptionService visionDescriptionService;
+    private final TextChunkingService textChunkingService;
+    private final ChunkFigureRefService chunkFigureRefService;
+    private final SectionRepository sectionRepository;
+    private final ChapterRepository chapterRepository;
+    private final FigureRepository figureRepository;
+    private final ChunkFigureRefRepository chunkFigureRefRepository;
+    private final FigureStorageService figureStorageService;
+
+    public BookEmbeddingService(
+            VectorStore vectorStore,
+            BookRepository bookRepository,
+            PdfStructureParser pdfStructureParser,
+            FigureExtractionService figureExtractionService,
+            VisionDescriptionService visionDescriptionService,
+            TextChunkingService textChunkingService,
+            ChunkFigureRefService chunkFigureRefService,
+            SectionRepository sectionRepository,
+            ChapterRepository chapterRepository,
+            FigureRepository figureRepository,
+            ChunkFigureRefRepository chunkFigureRefRepository,
+            FigureStorageService figureStorageService) {
         this.vectorStore = vectorStore;
         this.bookRepository = bookRepository;
+        this.pdfStructureParser = pdfStructureParser;
+        this.figureExtractionService = figureExtractionService;
+        this.visionDescriptionService = visionDescriptionService;
+        this.textChunkingService = textChunkingService;
+        this.chunkFigureRefService = chunkFigureRefService;
+        this.sectionRepository = sectionRepository;
+        this.chapterRepository = chapterRepository;
+        this.figureRepository = figureRepository;
+        this.chunkFigureRefRepository = chunkFigureRefRepository;
+        this.figureStorageService = figureStorageService;
     }
 
     @Async
     public void embedBook(UUID bookId, String bookTitle, Path pdfPath) {
-        log.info("Starting embedding for book {} ({})", bookId, bookTitle);
+        log.info("Starting image-aware embedding for book {} ({})", bookId, bookTitle);
 
         Book book = bookRepository.findById(bookId).orElse(null);
         if (book == null) {
@@ -47,29 +81,68 @@ public class BookEmbeddingService {
             book.setStatus(BookStatus.PROCESSING);
             bookRepository.save(book);
 
-            PagePdfDocumentReader reader = new PagePdfDocumentReader(
-                    new FileSystemResource(pdfPath.toFile()),
-                    PdfDocumentReaderConfig.builder()
-                            .withPagesPerDocument(1)
-                            .build()
-            );
+            // Step 1: Parse PDF into page-level sections persisted in Postgres
+            List<SectionEntity> sections = pdfStructureParser.parse(bookId, bookTitle, pdfPath);
+            String chapterId = bookId + "-ch1";
 
-            List<Document> pages = reader.get();
-            int pageCount = pages.size();
+            // Step 2: Build and embed text chunks for all sections in batches
+            List<Document> allChunks = new ArrayList<>();
+            for (SectionEntity section : sections) {
+                List<Document> chunks = textChunkingService.chunk(section, bookTitle);
+                allChunks.addAll(chunks);
+            }
+            embedInBatches(allChunks, bookId);
+            log.info("Embedded {} text chunks for book {}", allChunks.size(), bookId);
 
-            // Enrich metadata and tag diagram captions
-            List<Document> enriched = pages.stream()
-                    .map(doc -> enrichDocument(doc, bookId.toString(), bookTitle))
-                    .toList();
+            // Step 3: Extract images from the PDF, save to file store, persist FigureEntity
+            List<FigureEntity> figures = figureExtractionService.extract(
+                    bookId, chapterId, sections, pdfPath);
 
-            vectorStore.add(enriched);
+            // Step 4: For each figure, generate vision description and embed caption
+            for (FigureEntity figure : figures) {
+                Path imagePath = figureStorageService.resolve(figure.getImagePath());
+                String description = visionDescriptionService.describe(
+                        imagePath, figure.getCaption());
+
+                // Use description as caption fallback if no caption was detected
+                if (figure.getCaption() == null || figure.getCaption().isBlank()) {
+                    figure.setCaption(description);
+                    figureRepository.save(figure);
+                }
+
+                // Content for embedding = vision description + caption for maximum signal
+                String embeddingContent = description
+                        + (figure.getCaption() != null ? "\n" + figure.getCaption() : "");
+
+                String embeddingId = UUID.randomUUID().toString();
+                Map<String, Object> metadata = buildFigureMetadata(figure, bookTitle, embeddingId);
+                Document figureDoc = new Document(embeddingId, embeddingContent, metadata);
+                vectorStore.add(List.of(figureDoc));
+
+                figure.setCaptionEmbeddingId(UUID.fromString(embeddingId));
+                figureRepository.save(figure);
+            }
+            log.info("Embedded {} figure captions for book {}", figures.size(), bookId);
+
+            // Step 5: Link text chunks to figures via text references
+            for (SectionEntity section : sections) {
+                List<Document> sectionChunks = allChunks.stream()
+                        .filter(d -> section.getId().equals(d.getMetadata().get("section_id")))
+                        .toList();
+                List<FigureEntity> sectionFigures = figures.stream()
+                        .filter(f -> section.getId().equals(f.getSectionId()))
+                        .toList();
+                chunkFigureRefService.linkChunksToFigures(
+                        sectionChunks, sectionFigures, section.getPageStart());
+            }
 
             book.setStatus(BookStatus.READY);
-            book.setPageCount(pageCount);
-            book.setProcessedAt(java.time.Instant.now());
+            book.setPageCount(sections.size());
+            book.setProcessedAt(Instant.now());
             bookRepository.save(book);
 
-            log.info("Finished embedding book {} — {} pages", bookId, pageCount);
+            log.info("Finished embedding book {} — {} pages, {} figures",
+                    bookId, sections.size(), figures.size());
 
         } catch (Exception ex) {
             log.error("Failed to embed book {}", bookId, ex);
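Review note on Step 4: when no caption was detected, the caption is first set to the vision description and the embedding content is then built as description plus caption. Assuming `setCaption`/`getCaption` are a plain setter/getter pair, that path would embed the description twice. The sketch below shows one way to assemble the text without the repetition. `FigureEmbeddingText.build` is a hypothetical helper, not code from this commit.

```java
// Hypothetical helper, not part of the commit: builds the text that gets embedded for a figure.
final class FigureEmbeddingText {

    // Returns the description alone when the caption is missing, blank,
    // or identical to the description; otherwise description + newline + caption.
    static String build(String description, String caption) {
        boolean hasCaption = caption != null && !caption.isBlank();
        if (!hasCaption || caption.equals(description)) {
            return description;
        }
        return description + "\n" + caption;
    }
}
```

Whether a duplicated description actually hurts retrieval quality is not established here; the sketch only removes the redundancy.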
@@ -79,40 +152,74 @@ public class BookEmbeddingService {
         }
     }
 
-    private Document enrichDocument(Document doc, String bookId, String bookTitle) {
-        String content = doc.getText();
-        String chunkType = detectChunkType(content);
+    @Transactional
+    public void deleteBookChunks(UUID bookId) {
+        log.info("Deleting all data for book {}", bookId);
+        try {
+            // Delete chunk-figure refs (by figureId for this book)
+            List<String> figureIds = figureRepository.findAllByBookId(bookId)
+                    .stream().map(FigureEntity::getId).toList();
+            if (!figureIds.isEmpty()) {
+                chunkFigureRefRepository.deleteByFigureIdIn(figureIds);
+            }
 
-        doc.getMetadata().put("book_id", bookId);
-        doc.getMetadata().put("book_title", bookTitle);
-        doc.getMetadata().put("chunk_type", chunkType);
+            // Delete figures from Postgres
+            figureRepository.deleteAllByBookId(bookId);
 
-        return doc;
+            // Delete figure files from disk
+            figureStorageService.deleteAll(bookId);
+
+            // Delete sections and chapters from Postgres
+            sectionRepository.deleteAllByBookId(bookId);
+            chapterRepository.deleteAllByBookId(bookId);
+
+            // Delete vector store entries (text chunks + figure embeddings)
+            FilterExpressionBuilder b = new FilterExpressionBuilder();
+            vectorStore.delete(b.eq("book_id", bookId.toString()).build());
+
+        } catch (Exception ex) {
+            log.warn("Error during cleanup for book {}: {}", bookId, ex.getMessage());
+        }
     }
 
-    private String detectChunkType(String content) {
-        if (content != null) {
-            for (String line : content.split("\\r?\\n")) {
-                if (CAPTION_PATTERN.matcher(line.trim()).find()) {
-                    return "diagram";
+    private void embedInBatches(List<Document> docs, UUID bookId) {
+        int total = docs.size();
+        for (int i = 0; i < total; i += embeddingBatchSize) {
+            List<Document> batch = docs.subList(i, Math.min(i + embeddingBatchSize, total));
+            vectorStore.add(batch);
+            int batchNum = i / embeddingBatchSize + 1;
+            int totalBatches = (total - 1) / embeddingBatchSize + 1;
+            log.debug("Embedded batch {}/{} for book {}", batchNum, totalBatches, bookId);
+            if (i + embeddingBatchSize < total) {
+                try {
+                    Thread.sleep(embeddingBatchDelayMs);
+                } catch (InterruptedException e) {
+                    Thread.currentThread().interrupt();
+                    log.warn("Embedding batch sleep interrupted for book {}", bookId);
                 }
             }
         }
-        return "text";
     }
 
-    public void deleteBookChunks(UUID bookId) {
-        log.info("Deleting vector chunks for book {}", bookId);
-        try {
-            FilterExpressionBuilder b = new FilterExpressionBuilder();
-            vectorStore.delete(b.eq("book_id", bookId.toString()).build());
-        } catch (Exception ex) {
-            log.warn("Could not delete vector chunks for book {}: {}", bookId, ex.getMessage());
-        }
+    private Map<String, Object> buildFigureMetadata(FigureEntity figure, String bookTitle,
+                                                    String embeddingId) {
+        Map<String, Object> m = new HashMap<>();
+        m.put("type", "FIGURE");
+        m.put("book_id", figure.getBookId().toString());
+        m.put("book_title", bookTitle);
+        m.put("chapter_id", figure.getChapterId() != null ? figure.getChapterId() : "");
+        m.put("section_id", figure.getSectionId() != null ? figure.getSectionId() : "");
+        m.put("figure_id", figure.getId());
+        m.put("figure_type", figure.getFigureType().name());
+        m.put("image_path", figure.getImagePath());
+        m.put("label", figure.getLabel() != null ? figure.getLabel() : "");
+        m.put("page", figure.getPage());
+        m.put("embedding_id", embeddingId);
+        return m;
     }
 
-    private String truncate(String message, int maxLength) {
-        if (message == null) return null;
-        return message.length() <= maxLength ? message : message.substring(0, maxLength);
+    private String truncate(String msg, int max) {
+        if (msg == null) return null;
+        return msg.length() <= max ? msg : msg.substring(0, max);
     }
 }
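Review note on `embedInBatches`: the loop walks the chunk list in windows of `embeddingBatchSize` and computes the batch count with the ceiling-division form `(total - 1) / size + 1`. With `app.embedding.batch-size` set to 0 the loop index would never advance. The sketch below isolates the same windowing arithmetic with an explicit guard for that case. `BatchPlan`, `partition` and `totalBatches` are hypothetical names, and the guard is an addition, not behaviour of the commit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the commit: the batching arithmetic in isolation.
final class BatchPlan {

    // Splits items into consecutive sub-lists of at most batchSize elements.
    // Rejects non-positive sizes so the loop always advances.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    // Ceiling division for total > 0; returns 0 for an empty input.
    static int totalBatches(int total, int batchSize) {
        return total == 0 ? 0 : (total - 1) / batchSize + 1;
    }
}
```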
@@ -1,11 +1,13 @@
 package com.aiteacher.book;
 
+import org.springframework.beans.factory.annotation.Value;
 import org.springframework.stereotype.Service;
 import org.springframework.web.multipart.MultipartFile;
 
 import java.io.IOException;
 import java.nio.file.Files;
 import java.nio.file.Path;
+import java.nio.file.Paths;
 import java.util.List;
 import java.util.NoSuchElementException;
 import java.util.UUID;
@@ -15,10 +17,15 @@ public class BookService {
 
     private final BookRepository bookRepository;
     private final BookEmbeddingService bookEmbeddingService;
+    private final Path bookStoragePath;
 
-    public BookService(BookRepository bookRepository, BookEmbeddingService bookEmbeddingService) {
+    public BookService(
+            BookRepository bookRepository,
+            BookEmbeddingService bookEmbeddingService,
+            @Value("${app.figure-storage.base-path:./uploads}") String basePath) {
         this.bookRepository = bookRepository;
         this.bookEmbeddingService = bookEmbeddingService;
+        this.bookStoragePath = Paths.get(basePath).toAbsolutePath().normalize().resolve("books");
     }
 
     public Book upload(MultipartFile file) throws IOException {
@@ -28,20 +35,35 @@ public class BookService {
         }
 
         String title = deriveTitle(originalFilename);
-
         Book book = new Book(title, originalFilename, file.getSize());
         book = bookRepository.save(book);
 
-        // Write to a temp file so the async task can read it
-        Path tempFile = Files.createTempFile("aiteacher-", "-" + book.getId() + ".pdf");
-        file.transferTo(tempFile.toFile());
+        // Persist PDF in a stable location for potential re-embedding
+        Files.createDirectories(bookStoragePath);
+        Path pdfPath = bookStoragePath.resolve(book.getId() + ".pdf");
+        file.transferTo(pdfPath.toFile());
 
         UUID bookId = book.getId();
-        Path pdfPath = tempFile;
-        String bookTitle = title;
+        bookEmbeddingService.embedBook(bookId, title, pdfPath);
+        return book;
+    }
 
-        bookEmbeddingService.embedBook(bookId, bookTitle, pdfPath);
+    public Book reembed(UUID id) {
+        Book book = bookRepository.findById(id)
+                .orElseThrow(() -> new NoSuchElementException("Book not found."));
+
+        if (book.getStatus() == BookStatus.PROCESSING) {
+            throw new IllegalStateException("Book is already being processed.");
+        }
+
+        Path pdfPath = bookStoragePath.resolve(id + ".pdf");
+        if (!Files.exists(pdfPath)) {
+            throw new IllegalStateException(
+                    "Original PDF not found. Please re-upload the book before re-embedding.");
+        }
 
+        bookEmbeddingService.deleteBookChunks(id);
+        bookEmbeddingService.embedBook(id, book.getTitle(), pdfPath);
         return book;
     }
 
@@ -63,14 +85,21 @@ public class BookService {
         }
 
         bookEmbeddingService.deleteBookChunks(id);
+
+        // Delete the stored PDF
+        Path pdfPath = bookStoragePath.resolve(id + ".pdf");
+        try {
+            Files.deleteIfExists(pdfPath);
+        } catch (IOException ex) {
+            // Non-fatal — log only
+        }
+
         bookRepository.deleteById(id);
     }
 
     private String deriveTitle(String filename) {
-        // Strip .pdf extension and replace separators with spaces
         String name = filename.replaceAll("(?i)\\.pdf$", "");
         name = name.replaceAll("[-_]", " ");
-        // Capitalise first letter
         if (!name.isEmpty()) {
             name = Character.toUpperCase(name.charAt(0)) + name.substring(1);
         }
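Review note on `deriveTitle`: this hunk drops the two comments that explained the method, so a small standalone version may help when checking its behaviour. The three visible steps are copied from the diff. The final `return name;` is an assumption, since the end of the method is outside the hunk, and `TitleDeriver` is a hypothetical wrapper class.

```java
// Hypothetical wrapper, not part of the commit: the title derivation steps in isolation.
final class TitleDeriver {

    static String deriveTitle(String filename) {
        // Strip a trailing ".pdf" extension, case-insensitively.
        String name = filename.replaceAll("(?i)\\.pdf$", "");
        // Replace '-' and '_' separators with spaces.
        name = name.replaceAll("[-_]", " ");
        // Capitalise the first character, if there is one.
        if (!name.isEmpty()) {
            name = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        }
        return name; // assumed: the end of the original method is not shown in the diff
    }
}
```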
@@ -0,0 +1,12 @@
+package com.aiteacher.book;
+
+public record FigureResponse(
+        String figureId,
+        String label,
+        String caption,
+        String figureType,
+        int page,
+        String imageUrl,
+        String sectionId,
+        String sectionTitle
+) {}
@@ -3,22 +3,16 @@ package com.aiteacher.chat;
 import com.aiteacher.book.BookRepository;
 import com.aiteacher.book.BookStatus;
 import com.aiteacher.book.NoKnowledgeSourceException;
+import com.aiteacher.document.FigureEntity;
+import com.aiteacher.document.SectionEntity;
+import com.aiteacher.retrieval.NeurosurgeryRetriever;
+import com.aiteacher.retrieval.RetrievalResult;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 import org.springframework.ai.chat.client.ChatClient;
-import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
-import org.springframework.ai.chat.model.ChatResponse;
-import org.springframework.ai.document.Document;
-import org.springframework.ai.vectorstore.SearchRequest;
-import org.springframework.ai.vectorstore.VectorStore;
 import org.springframework.stereotype.Service;
 
-import java.util.ArrayList;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
-import java.util.NoSuchElementException;
-import java.util.UUID;
+import java.util.*;
 
 @Service
 public class ChatService {
@@ -35,26 +29,28 @@ public class ChatService {
             - Build answers from what is present: procedures, conditions, techniques, and descriptions all contribute; combine them into a rich, structured response
             - Use clear structure: headings, bullet points, or numbered steps where appropriate to maximize clarity
             - Only say you cannot answer if the context is entirely unrelated to the question
-            - Cite sources for each major point (book title and page number from the context metadata)
+            - Cite sources for each major point (book title and page number from the context)
+            - When referencing diagrams or figures, cite them as [Fig. X, p.N]
             - Maintain continuity with the conversation history
             - Never fabricate clinical information not present in the context
             """;
 
     private final ChatClient chatClient;
-    private final VectorStore vectorStore;
     private final BookRepository bookRepository;
     private final ChatSessionRepository sessionRepository;
     private final MessageRepository messageRepository;
+    private final NeurosurgeryRetriever retriever;
 
-    public ChatService(ChatClient chatClient, VectorStore vectorStore,
+    public ChatService(ChatClient chatClient,
                        BookRepository bookRepository,
                        ChatSessionRepository sessionRepository,
-                       MessageRepository messageRepository) {
+                       MessageRepository messageRepository,
+                       NeurosurgeryRetriever retriever) {
         this.chatClient = chatClient;
-        this.vectorStore = vectorStore;
         this.bookRepository = bookRepository;
         this.sessionRepository = sessionRepository;
         this.messageRepository = messageRepository;
+        this.retriever = retriever;
     }
 
     public ChatSession createSession(String topicId) {
@@ -73,7 +69,11 @@ public class ChatService {
|
|||||||
ChatSession session = sessionRepository.findById(sessionId)
|
ChatSession session = sessionRepository.findById(sessionId)
|
||||||
.orElseThrow(() -> new NoSuchElementException("Session not found."));
|
.orElseThrow(() -> new NoSuchElementException("Session not found."));
|
||||||
|
|
||||||
if (!bookRepository.existsByStatus(BookStatus.READY)) {
|
List<com.aiteacher.book.Book> readyBooks = bookRepository.findAll().stream()
|
||||||
|
.filter(b -> b.getStatus() == BookStatus.READY)
|
||||||
|
.toList();
|
||||||
|
|
||||||
|
if (readyBooks.isEmpty()) {
|
||||||
throw new NoKnowledgeSourceException("No books are available as knowledge sources.");
|
throw new NoKnowledgeSourceException("No books are available as knowledge sources.");
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -81,27 +81,31 @@ public class ChatService {
|
|||||||
Message userMessage = new Message(sessionId, MessageRole.USER, userContent);
|
Message userMessage = new Message(sessionId, MessageRole.USER, userContent);
|
||||||
messageRepository.save(userMessage);
|
messageRepository.save(userMessage);
|
||||||
|
|
||||||
// Build conversation history for context
|
// Build full question with conversation history
|
||||||
List<Message> history = messageRepository.findBySessionIdOrderByCreatedAtAsc(sessionId);
|
List<Message> history = messageRepository.findBySessionIdOrderByCreatedAtAsc(sessionId);
|
||||||
|
|
||||||
// Build the prompt with full conversation history as context
|
|
||||||
String fullQuestion = buildQuestionWithHistory(history, userContent, session.getTopicId());
|
String fullQuestion = buildQuestionWithHistory(history, userContent, session.getTopicId());
|
||||||
|
|
||||||
var qaAdvisor = QuestionAnswerAdvisor.builder(vectorStore)
|
// Retrieve context from all ready books (aggregate across books)
|
||||||
.searchRequest(SearchRequest.builder().similarityThreshold(0.5d).topK(6).build())
|
List<SectionEntity> allSections = new ArrayList<>();
|
||||||
.build();
|
List<FigureEntity> allFigures = new ArrayList<>();
|
||||||
|
for (com.aiteacher.book.Book book : readyBooks) {
|
||||||
|
RetrievalResult result = retriever.retrieve(fullQuestion, book.getId());
|
||||||
|
allSections.addAll(result.parentSections());
|
||||||
|
allFigures.addAll(result.figures());
|
||||||
|
}
|
||||||
|
|
||||||
ChatResponse response = chatClient.prompt()
|
// Build LLM prompt with section full texts and figure references
|
||||||
.advisors(qaAdvisor)
|
String contextPrompt = buildContextPrompt(fullQuestion, allSections, allFigures);
|
||||||
|
|
||||||
|
String assistantContent = chatClient.prompt()
|
||||||
.system(SYSTEM_PROMPT)
|
.system(SYSTEM_PROMPT)
|
||||||
.user(fullQuestion)
|
.user(contextPrompt)
|
||||||
.call()
|
.call()
|
||||||
.chatResponse();
|
.content();
|
||||||
|
|
||||||
String assistantContent = response.getResult().getOutput().getText();
|
// Build sources list with TEXT and FIGURE entries
|
||||||
List<Map<String, Object>> sources = extractSources(response);
|
List<Map<String, Object>> sources = buildSources(allSections, allFigures);
|
||||||
|
|
||||||
// Persist assistant message
|
|
||||||
Message assistantMessage = new Message(sessionId, MessageRole.ASSISTANT, assistantContent);
|
Message assistantMessage = new Message(sessionId, MessageRole.ASSISTANT, assistantContent);
|
||||||
assistantMessage.setSources(sources);
|
assistantMessage.setSources(sources);
|
||||||
return messageRepository.save(assistantMessage);
|
return messageRepository.save(assistantMessage);
|
||||||
@@ -118,24 +122,95 @@ public class ChatService {
|
|||||||
sessionRepository.deleteById(sessionId);
|
sessionRepository.deleteById(sessionId);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
// Private helpers
|
||||||
|
// -------------------------------------------------------------------------
|
||||||
|
|
||||||
|
private String buildContextPrompt(String question,
|
||||||
|
List<SectionEntity> sections,
|
||||||
|
List<FigureEntity> figures) {
|
||||||
|
StringBuilder sb = new StringBuilder();
|
||||||
|
|
||||||
|
if (!sections.isEmpty()) {
|
||||||
|
sb.append("CONTEXT:\n\n");
|
||||||
|
for (SectionEntity section : sections) {
|
||||||
|
sb.append("[").append(section.getTitle())
|
||||||
|
.append(", p.").append(section.getPageStart()).append("]\n");
|
||||||
|
sb.append(section.getFullText()).append("\n\n");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!figures.isEmpty()) {
|
||||||
|
sb.append("AVAILABLE FIGURES:\n");
|
||||||
|
for (FigureEntity figure : figures) {
|
||||||
|
sb.append("- ").append(figure.getLabel() != null ? figure.getLabel() : "Figure")
|
||||||
|
.append(" (p.").append(figure.getPage()).append("): ")
|
||||||
|
.append(figure.getCaption() != null ? figure.getCaption() : "")
|
||||||
|
.append("\n");
|
||||||
|
}
|
||||||
|
sb.append("\nWhen referencing diagrams, cite them as [Fig. X, p.N].\n\n");
|
||||||
|
}
|
||||||
|
|
||||||
|
sb.append("QUESTION:\n").append(question);
|
||||||
|
return sb.toString();
|
||||||
|
}
|
||||||
|
|
||||||
|
private List<Map<String, Object>> buildSources(List<SectionEntity> sections,
|
||||||
|
List<FigureEntity> figures) {
|
||||||
|
List<Map<String, Object>> sources = new ArrayList<>();
|
||||||
|
|
||||||
|
for (SectionEntity section : sections) {
|
||||||
|
Map<String, Object> source = new LinkedHashMap<>();
|
||||||
|
source.put("type", "TEXT");
|
||||||
|
source.put("bookTitle", deriveTitleFromSection(section));
|
||||||
|
source.put("page", section.getPageStart());
|
||||||
|
source.put("chunkText", truncate(section.getFullText(), 500));
|
||||||
|
sources.add(source);
|
||||||
|
}
|
||||||
|
|
||||||
|
for (FigureEntity figure : figures) {
|
||||||
|
Map<String, Object> source = new LinkedHashMap<>();
|
||||||
|
source.put("type", "FIGURE");
|
||||||
|
source.put("bookTitle", bookRepository.findById(figure.getBookId())
|
||||||
|
.map(com.aiteacher.book.Book::getTitle).orElse("Book"));
|
||||||
|
source.put("page", figure.getPage());
|
||||||
|
source.put("figureId", figure.getId());
|
||||||
|
source.put("label", figure.getLabel() != null ? figure.getLabel() : "");
|
||||||
|
source.put("caption", figure.getCaption() != null ? figure.getCaption() : "");
|
||||||
|
source.put("figureType", figure.getFigureType().name());
|
||||||
|
// imageUrl assembled from relative path: figures/{bookId}/{filename}
|
||||||
|
String filename = figure.getImagePath().substring(
|
||||||
|
figure.getImagePath().lastIndexOf('/') + 1);
|
||||||
|
source.put("imageUrl", "/api/v1/figures/" + figure.getBookId() + "/" + filename);
|
||||||
|
sources.add(source);
|
||||||
|
}
|
||||||
|
|
||||||
|
return sources;
|
||||||
|
}
|
||||||
|
|
||||||
|
private String deriveTitleFromSection(SectionEntity section) {
|
||||||
|
if (section == null) return "Book";
|
||||||
|
return bookRepository.findById(section.getBookId())
|
||||||
|
.map(com.aiteacher.book.Book::getTitle)
|
||||||
|
.orElse("Book");
|
||||||
|
}
|
||||||
|
|
||||||
private String buildQuestionWithHistory(List<Message> history, String currentQuestion,
|
private String buildQuestionWithHistory(List<Message> history, String currentQuestion,
|
||||||
String topicId) {
|
String topicId) {
|
||||||
boolean hasTopic = topicId != null && !topicId.equals("free-form");
|
boolean hasTopic = topicId != null && !topicId.equals("free-form");
|
||||||
|
|
||||||
if (history.size() <= 1) {
|
if (history.size() <= 1) {
|
||||||
return hasTopic
|
return hasTopic
|
||||||
? String.format("[Context: This is a question about the neurosurgery topic '%s']\n%s",
|
? String.format("[Context: question about neurosurgery topic '%s']\n%s",
|
||||||
topicId, currentQuestion)
|
topicId, currentQuestion)
|
||||||
: currentQuestion;
|
: currentQuestion;
|
||||||
}
|
}
|
||||||
|
|
||||||
StringBuilder sb = new StringBuilder();
|
StringBuilder sb = new StringBuilder();
|
||||||
if (hasTopic) {
|
if (hasTopic) {
|
||||||
sb.append(String.format("[Context: This conversation is about the neurosurgery topic '%s']\n\n",
|
sb.append(String.format("[Context: conversation about '%s']\n\n", topicId));
|
||||||
topicId));
|
|
||||||
}
|
}
|
||||||
sb.append("Previous conversation:\n");
|
sb.append("Previous conversation:\n");
|
||||||
// Include all messages except the last (which is the current user message just saved)
|
|
||||||
for (int i = 0; i < history.size() - 1; i++) {
|
for (int i = 0; i < history.size() - 1; i++) {
|
||||||
Message msg = history.get(i);
|
Message msg = history.get(i);
|
||||||
sb.append(msg.getRole().name()).append(": ").append(msg.getContent()).append("\n");
|
sb.append(msg.getRole().name()).append(": ").append(msg.getContent()).append("\n");
|
||||||
@@ -144,30 +219,8 @@ public class ChatService {
|
|||||||
return sb.toString();
|
return sb.toString();
|
||||||
}
|
}
|
||||||
|
|
||||||
private List<Map<String, Object>> extractSources(ChatResponse response) {
|
private String truncate(String text, int maxChars) {
|
||||||
List<Map<String, Object>> sources = new ArrayList<>();
|
if (text == null) return "";
|
||||||
|
return text.length() <= maxChars ? text : text.substring(0, maxChars) + "…";
|
||||||
if (response.getMetadata() != null) {
|
|
||||||
Object retrieved = response.getMetadata().get(QuestionAnswerAdvisor.RETRIEVED_DOCUMENTS);
|
|
||||||
if (retrieved instanceof List<?> docs) {
|
|
||||||
for (Object docObj : docs) {
|
|
||||||
if (docObj instanceof Document doc) {
|
|
||||||
Map<String, Object> metadata = doc.getMetadata();
|
|
||||||
String bookTitle = (String) metadata.get("book_title");
|
|
||||||
Object pageObj = metadata.get("page_number");
|
|
||||||
Integer page = pageObj instanceof Number n ? n.intValue() : null;
|
|
||||||
if (bookTitle != null) {
|
|
||||||
Map<String, Object> source = new HashMap<>();
|
|
||||||
source.put("bookTitle", bookTitle);
|
|
||||||
source.put("page", page);
|
|
||||||
source.put("chunkText", doc.getText());
|
|
||||||
sources.add(source);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return sources;
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
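The `imageUrl` built in `buildSources` above takes everything after the last `/` of the stored image path. A minimal sketch of that derivation follows, assuming the stored path looks like `figures/{bookId}/{filename}` as the inline comment states. `FigureUrlSketch` and `imageUrl` are hypothetical names, and the backslash handling is an added assumption for paths written on Windows, not something the commit does.

```java
import java.util.UUID;

public class FigureUrlSketch {

    /**
     * Builds the public URL for a stored figure.
     * Assumption: imagePath ends in the file name, e.g. "figures/{bookId}/{filename}".
     */
    public static String imageUrl(UUID bookId, String imagePath) {
        // Normalise Windows separators so lastIndexOf('/') finds the file name.
        String normalized = imagePath.replace('\\', '/');
        // lastIndexOf returns -1 when there is no separator, so substring(0) keeps the whole name.
        String filename = normalized.substring(normalized.lastIndexOf('/') + 1);
        return "/api/v1/figures/" + bookId + "/" + filename;
    }
}
```

The resulting URL has the same `/api/v1/figures/{bookId}/{filename}` shape as the one assembled in `buildSources`.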
@@ -0,0 +1,25 @@
|
|||||||
|
package com.aiteacher.config;
|
||||||
|
|
||||||
|
import org.springframework.beans.factory.annotation.Value;
|
||||||
|
import org.springframework.context.annotation.Configuration;
|
||||||
|
import org.springframework.web.servlet.config.annotation.ResourceHandlerRegistry;
|
||||||
|
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;
|
||||||
|
|
||||||
|
import java.nio.file.Paths;
|
||||||
|
|
||||||
|
@Configuration
|
||||||
|
public class FigureStorageConfig implements WebMvcConfigurer {
|
||||||
|
|
||||||
|
private final String basePath;
|
||||||
|
|
||||||
|
public FigureStorageConfig(@Value("${app.figure-storage.base-path:./uploads}") String basePath) {
|
||||||
|
this.basePath = Paths.get(basePath).toAbsolutePath().normalize().toString();
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public void addResourceHandlers(ResourceHandlerRegistry registry) {
|
||||||
|
// Serve GET /api/v1/figures/** from the local file store
|
||||||
|
registry.addResourceHandler("/api/v1/figures/**")
|
||||||
|
.addResourceLocations("file:" + basePath + "/figures/");
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,47 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import jakarta.persistence.*;
|
||||||
|
import java.time.Instant;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
@Entity
|
||||||
|
@Table(name = "chapter")
|
||||||
|
public class ChapterEntity {
|
||||||
|
|
||||||
|
@Id
|
||||||
|
@Column(name = "id", length = 200)
|
||||||
|
private String id;
|
||||||
|
|
||||||
|
@Column(name = "book_id", nullable = false)
|
||||||
|
private UUID bookId;
|
||||||
|
|
||||||
|
@Column(name = "number", nullable = false)
|
||||||
|
private int number;
|
||||||
|
|
||||||
|
@Column(name = "title", length = 500)
|
||||||
|
private String title;
|
||||||
|
|
||||||
|
@Column(name = "page_start")
|
||||||
|
private Integer pageStart;
|
||||||
|
|
||||||
|
@Column(name = "created_at", nullable = false)
|
||||||
|
private Instant createdAt;
|
||||||
|
|
||||||
|
public ChapterEntity() {}
|
||||||
|
|
||||||
|
public ChapterEntity(String id, UUID bookId, int number, String title, Integer pageStart) {
|
||||||
|
this.id = id;
|
||||||
|
this.bookId = bookId;
|
||||||
|
this.number = number;
|
||||||
|
this.title = title;
|
||||||
|
this.pageStart = pageStart;
|
||||||
|
this.createdAt = Instant.now();
|
||||||
|
}
|
||||||
|
|
||||||
|
public String getId() { return id; }
|
||||||
|
public UUID getBookId() { return bookId; }
|
||||||
|
public int getNumber() { return number; }
|
||||||
|
public String getTitle() { return title; }
|
||||||
|
public Integer getPageStart() { return pageStart; }
|
||||||
|
public Instant getCreatedAt() { return createdAt; }
|
||||||
|
}
|
||||||
@@ -0,0 +1,9 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.springframework.data.jpa.repository.JpaRepository;
|
||||||
|
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
public interface ChapterRepository extends JpaRepository<ChapterEntity, String> {
|
||||||
|
void deleteAllByBookId(UUID bookId);
|
||||||
|
}
|
||||||
@@ -0,0 +1,58 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import jakarta.persistence.*;
|
||||||
|
import java.io.Serializable;
|
||||||
|
import java.util.Objects;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
@Entity
|
||||||
|
@Table(name = "chunk_figure_ref")
|
||||||
|
@IdClass(ChunkFigureRefEntity.PK.class)
|
||||||
|
public class ChunkFigureRefEntity {
|
||||||
|
|
||||||
|
@Id
|
||||||
|
@Column(name = "chunk_id", nullable = false)
|
||||||
|
private UUID chunkId;
|
||||||
|
|
||||||
|
@Id
|
||||||
|
@Column(name = "figure_id", nullable = false, length = 200)
|
||||||
|
private String figureId;
|
||||||
|
|
||||||
|
@Column(name = "mention_page")
|
||||||
|
private Integer mentionPage;
|
||||||
|
|
||||||
|
public ChunkFigureRefEntity() {}
|
||||||
|
|
||||||
|
public ChunkFigureRefEntity(UUID chunkId, String figureId, Integer mentionPage) {
|
||||||
|
this.chunkId = chunkId;
|
||||||
|
this.figureId = figureId;
|
||||||
|
this.mentionPage = mentionPage;
|
||||||
|
}
|
||||||
|
|
||||||
|
public UUID getChunkId() { return chunkId; }
|
||||||
|
public String getFigureId() { return figureId; }
|
||||||
|
public Integer getMentionPage() { return mentionPage; }
|
||||||
|
|
||||||
|
public static class PK implements Serializable {
|
||||||
|
private UUID chunkId;
|
||||||
|
private String figureId;
|
||||||
|
|
||||||
|
public PK() {}
|
||||||
|
public PK(UUID chunkId, String figureId) {
|
||||||
|
this.chunkId = chunkId;
|
||||||
|
this.figureId = figureId;
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public boolean equals(Object o) {
|
||||||
|
if (this == o) return true;
|
||||||
|
if (!(o instanceof PK pk)) return false;
|
||||||
|
return Objects.equals(chunkId, pk.chunkId) && Objects.equals(figureId, pk.figureId);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public int hashCode() {
|
||||||
|
return Objects.hash(chunkId, figureId);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,18 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.springframework.data.jpa.repository.JpaRepository;
|
||||||
|
import org.springframework.data.jpa.repository.Query;
|
||||||
|
import org.springframework.data.repository.query.Param;
|
||||||
|
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
public interface ChunkFigureRefRepository extends JpaRepository<ChunkFigureRefEntity, ChunkFigureRefEntity.PK> {
|
||||||
|
|
||||||
|
@Query("SELECT r FROM ChunkFigureRefEntity r WHERE r.chunkId IN :chunkIds")
|
||||||
|
List<ChunkFigureRefEntity> findByChunkIdIn(@Param("chunkIds") List<UUID> chunkIds);
|
||||||
|
|
||||||
|
@Query("DELETE FROM ChunkFigureRefEntity r WHERE r.figureId IN :figureIds")
|
||||||
|
@org.springframework.data.jpa.repository.Modifying
|
||||||
|
void deleteByFigureIdIn(@Param("figureIds") List<String> figureIds);
|
||||||
|
}
|
||||||
@@ -0,0 +1,62 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.slf4j.Logger;
|
||||||
|
import org.slf4j.LoggerFactory;
|
||||||
|
import org.springframework.ai.document.Document;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.UUID;
|
||||||
|
import java.util.regex.Matcher;
|
||||||
|
import java.util.regex.Pattern;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Scans chunk text for "Fig. X" and "Figure X" references and persists
|
||||||
|
* ChunkFigureRefEntity rows linking that chunk to its referenced figures.
|
||||||
|
*/
|
||||||
|
@Service
|
||||||
|
public class ChunkFigureRefService {
|
||||||
|
|
||||||
|
private static final Logger log = LoggerFactory.getLogger(ChunkFigureRefService.class);
|
||||||
|
|
||||||
|
// Matches: "Fig. 12-4", "Fig. 12.4", "Fig 12", "Figure 12-4", etc.
|
||||||
|
private static final Pattern REF_PATTERN =
|
||||||
|
Pattern.compile("(?i)\\b(Fig\\.?|Figure)\\s+(\\d+[\\-.\\d]*)");
|
||||||
|
|
||||||
|
private final ChunkFigureRefRepository refRepository;
|
||||||
|
|
||||||
|
public ChunkFigureRefService(ChunkFigureRefRepository refRepository) {
|
||||||
|
this.refRepository = refRepository;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* For each text chunk, finds figure references and persists ChunkFigureRefEntity rows.
|
||||||
|
*/
|
||||||
|
public void linkChunksToFigures(List<Document> chunks, List<FigureEntity> bookFigures,
|
||||||
|
int pageNum) {
|
||||||
|
if (bookFigures.isEmpty()) return;
|
||||||
|
|
||||||
|
for (Document chunk : chunks) {
|
||||||
|
String chunkIdStr = chunk.getId();
|
||||||
|
UUID chunkId;
|
||||||
|
try {
|
||||||
|
chunkId = UUID.fromString(chunkIdStr);
|
||||||
|
} catch (IllegalArgumentException ex) {
|
||||||
|
log.warn("Chunk has non-UUID id: {}", chunkIdStr);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
Matcher m = REF_PATTERN.matcher(chunk.getText());
|
||||||
|
while (m.find()) {
|
||||||
|
String refNum = m.group(2).trim();
|
||||||
|
// Find matching figure by label suffix
|
||||||
|
for (FigureEntity figure : bookFigures) {
|
||||||
|
if (figure.getLabel() != null && figure.getLabel().endsWith(refNum)) {
|
||||||
|
refRepository.save(new ChunkFigureRefEntity(chunkId, figure.getId(), pageNum));
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
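Two edge cases in `linkChunksToFigures` above are worth illustrating. The character class in `REF_PATTERN` also captures sentence-ending punctuation ("Fig. 12-4." yields `12-4.`), and `endsWith` lets a reference to "2" match the label "Fig. 12". The sketch below reuses the same pattern and shows one possible way to handle both; `FigureRefSketch`, `extractRefNumbers` and `labelMatches` are hypothetical names that are not part of the commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FigureRefSketch {

    // Same expression as REF_PATTERN in ChunkFigureRefService.
    private static final Pattern REF_PATTERN =
            Pattern.compile("(?i)\\b(Fig\\.?|Figure)\\s+(\\d+[\\-.\\d]*)");

    /** Returns the referenced figure numbers with trailing '.' or '-' removed. */
    public static List<String> extractRefNumbers(String text) {
        List<String> refs = new ArrayList<>();
        Matcher m = REF_PATTERN.matcher(text);
        while (m.find()) {
            // "Fig. 12-4." captures "12-4."; strip the sentence punctuation.
            refs.add(m.group(2).replaceAll("[\\-.]+$", ""));
        }
        return refs;
    }

    /** Hypothetical exact comparison: "2" must not match the label "Fig. 12". */
    public static boolean labelMatches(String label, String refNum) {
        if (label == null) return false;
        // Drop the "Fig." / "Figure" prefix and any trailing punctuation from the label.
        String labelNum = label
                .replaceFirst("(?i)^(Figure|Fig\\.?)\\s*", "")
                .replaceAll("[\\-.]+$", "");
        return labelNum.equals(refNum);
    }
}
```

Comparing normalised numbers on both sides keeps the match exact, at the cost of one extra regex pass per label.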
@@ -0,0 +1,82 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import jakarta.persistence.*;
|
||||||
|
import java.time.Instant;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
@Entity
|
||||||
|
@Table(name = "figure")
|
||||||
|
public class FigureEntity {
|
||||||
|
|
||||||
|
@Id
|
||||||
|
@Column(name = "id", length = 200)
|
||||||
|
private String id;
|
||||||
|
|
||||||
|
@Column(name = "book_id", nullable = false)
|
||||||
|
private UUID bookId;
|
||||||
|
|
||||||
|
@Column(name = "section_id", length = 200)
|
||||||
|
private String sectionId;
|
||||||
|
|
||||||
|
@Column(name = "chapter_id", length = 200)
|
||||||
|
private String chapterId;
|
||||||
|
|
||||||
|
@Column(name = "label", length = 100)
|
||||||
|
private String label;
|
||||||
|
|
||||||
|
@Column(name = "caption", columnDefinition = "TEXT")
|
||||||
|
private String caption;
|
||||||
|
|
||||||
|
@Enumerated(EnumType.STRING)
|
||||||
|
@Column(name = "figure_type", nullable = false, length = 50)
|
||||||
|
private FigureType figureType;
|
||||||
|
|
||||||
|
@Column(name = "page", nullable = false)
|
||||||
|
private int page;
|
||||||
|
|
||||||
|
@Column(name = "image_path", nullable = false, length = 1000)
|
||||||
|
private String imagePath;
|
||||||
|
|
||||||
|
@Column(name = "caption_embedding_id")
|
||||||
|
private UUID captionEmbeddingId;
|
||||||
|
|
||||||
|
@Column(name = "created_at", nullable = false)
|
||||||
|
private Instant createdAt;
|
||||||
|
|
||||||
|
public FigureEntity() {}
|
||||||
|
|
||||||
|
public FigureEntity(String id, UUID bookId, String sectionId, String chapterId,
|
||||||
|
String label, String caption, FigureType figureType,
|
||||||
|
int page, String imagePath) {
|
||||||
|
this.id = id;
|
||||||
|
this.bookId = bookId;
|
||||||
|
this.sectionId = sectionId;
|
||||||
|
this.chapterId = chapterId;
|
||||||
|
this.label = label;
|
||||||
|
this.caption = caption;
|
||||||
|
this.figureType = figureType;
|
||||||
|
this.page = page;
|
||||||
|
this.imagePath = imagePath;
|
||||||
|
this.createdAt = Instant.now();
|
||||||
|
}
|
||||||
|
|
||||||
|
public String getId() { return id; }
|
||||||
|
public UUID getBookId() { return bookId; }
|
||||||
|
public String getSectionId() { return sectionId; }
|
||||||
|
public String getChapterId() { return chapterId; }
|
||||||
|
public String getLabel() { return label; }
|
||||||
|
public String getCaption() { return caption; }
|
||||||
|
public FigureType getFigureType() { return figureType; }
|
||||||
|
public int getPage() { return page; }
|
||||||
|
public String getImagePath() { return imagePath; }
|
||||||
|
public UUID getCaptionEmbeddingId() { return captionEmbeddingId; }
|
||||||
|
public Instant getCreatedAt() { return createdAt; }
|
||||||
|
|
||||||
|
public void setCaptionEmbeddingId(UUID captionEmbeddingId) {
|
||||||
|
this.captionEmbeddingId = captionEmbeddingId;
|
||||||
|
}
|
||||||
|
|
||||||
|
public void setCaption(String caption) {
|
||||||
|
this.caption = caption;
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,135 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import com.aiteacher.figure.FigureStorageService;
|
||||||
|
import org.apache.pdfbox.Loader;
|
||||||
|
import org.apache.pdfbox.cos.COSName;
|
||||||
|
import org.apache.pdfbox.pdmodel.PDDocument;
|
||||||
|
import org.apache.pdfbox.pdmodel.PDPage;
|
||||||
|
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
|
||||||
|
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
|
||||||
|
import org.slf4j.Logger;
|
||||||
|
import org.slf4j.LoggerFactory;
|
||||||
|
import org.springframework.beans.factory.annotation.Value;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
|
||||||
|
import java.awt.image.BufferedImage;
|
||||||
|
import java.io.IOException;
|
||||||
|
import java.nio.file.Path;
|
||||||
|
import java.util.ArrayList;
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.UUID;
|
||||||
|
import java.util.regex.Matcher;
|
||||||
|
import java.util.regex.Pattern;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Extracts images from each PDF page using PDFBox.
|
||||||
|
* Images below the configured minimum size are skipped.
|
||||||
|
* Caption is detected by the "Fig." pattern in page text.
|
||||||
|
*/
|
||||||
|
@Service
|
||||||
|
public class FigureExtractionService {
|
||||||
|
|
||||||
|
private static final Logger log = LoggerFactory.getLogger(FigureExtractionService.class);
|
||||||
|
|
||||||
|
// Caption: line starting with "Fig." or "Figure" followed by a number
|
||||||
|
private static final Pattern CAPTION_PATTERN =
|
||||||
|
Pattern.compile("(?m)^((?:Figure|Fig\\.?)\\s*\\d+[\\-.]?\\d*[^\\n]*)", Pattern.CASE_INSENSITIVE);
|
||||||
|
|
||||||
|
// Figure label: "Fig. 12-4" or "Fig. 12.4"
|
||||||
|
private static final Pattern LABEL_PATTERN =
|
||||||
|
Pattern.compile("(?i)(?:Figure|Fig\\.?)\\s*(\\d+[\\-.\\d]*)");
|
||||||
|
|
||||||
|
private final FigureStorageService storageService;
|
||||||
|
private final FigureRepository figureRepository;
|
||||||
|
private final int minImageSizePx;
|
||||||
|
|
||||||
|
public FigureExtractionService(
|
||||||
|
FigureStorageService storageService,
|
||||||
|
FigureRepository figureRepository,
|
||||||
|
@Value("${app.figure-storage.min-image-size-px:100}") int minImageSizePx) {
|
||||||
|
this.storageService = storageService;
|
||||||
|
this.figureRepository = figureRepository;
|
||||||
|
this.minImageSizePx = minImageSizePx;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Extracts all qualifying images from the PDF for the given book.
|
||||||
|
* Returns persisted FigureEntity list (without vision descriptions — set later).
|
||||||
|
*/
|
||||||
|
public List<FigureEntity> extract(UUID bookId, String chapterId,
|
||||||
|
List<SectionEntity> sections, Path pdfPath) {
|
||||||
|
List<FigureEntity> figures = new ArrayList<>();
|
||||||
|
int figureCounter = 0;
|
||||||
|
|
||||||
|
try (PDDocument doc = Loader.loadPDF(pdfPath.toFile())) {
|
||||||
|
for (SectionEntity section : sections) {
|
||||||
|
int pageIndex = section.getPageStart() - 1; // 0-based
|
||||||
|
if (pageIndex < 0 || pageIndex >= doc.getNumberOfPages()) continue;
|
||||||
|
|
||||||
|
PDPage page = doc.getPage(pageIndex);
|
||||||
|
String pageText = section.getFullText();
|
||||||
|
|
||||||
|
try {
|
||||||
|
for (COSName name : page.getResources().getXObjectNames()) {
|
||||||
|
PDXObject xObject = page.getResources().getXObject(name);
|
||||||
|
if (!(xObject instanceof PDImageXObject image)) continue;
|
||||||
|
|
||||||
|
BufferedImage bufferedImage = image.getImage();
|
||||||
|
if (bufferedImage.getWidth() < minImageSizePx
|
||||||
|
|| bufferedImage.getHeight() < minImageSizePx) {
|
||||||
|
continue; // skip decorative images
|
||||||
|
}
|
||||||
|
|
||||||
|
figureCounter++;
|
||||||
|
String figureId = bookId + "-fig-" + pageIndex + "-" + figureCounter;
|
||||||
|
String caption = detectCaption(pageText);
|
||||||
|
String label = detectLabel(caption, figureCounter);
|
||||||
|
FigureType type = classifyType(caption, pageText);
|
||||||
|
|
||||||
|
String imagePath = storageService.save(bookId, figureId, bufferedImage);
|
||||||
|
|
||||||
|
FigureEntity figure = new FigureEntity(
|
||||||
|
figureId, bookId, section.getId(), chapterId,
|
||||||
|
label, caption, type, section.getPageStart(), imagePath
|
||||||
|
);
|
||||||
|
figures.add(figureRepository.save(figure));
|
||||||
|
}
|
||||||
|
} catch (IOException ex) {
|
||||||
|
log.warn("Failed to extract images from page {} of book {}: {}",
|
||||||
|
section.getPageStart(), bookId, ex.getMessage());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch (IOException ex) {
|
||||||
|
log.error("Could not open PDF for image extraction, book {}", bookId, ex);
|
||||||
|
}
|
||||||
|
|
||||||
|
log.info("Extracted {} figures for book {}", figures.size(), bookId);
|
||||||
|
return figures;
|
||||||
|
}
|
||||||
|
|
||||||
|
private String detectCaption(String pageText) {
|
||||||
|
if (pageText == null) return null;
|
||||||
|
Matcher m = CAPTION_PATTERN.matcher(pageText);
|
||||||
|
return m.find() ? m.group(1).trim() : null;
|
||||||
|
}
|
||||||
|
|
||||||
|
private String detectLabel(String caption, int counter) {
|
||||||
|
if (caption != null) {
|
||||||
|
Matcher m = LABEL_PATTERN.matcher(caption);
|
||||||
|
if (m.find()) return "Fig. " + m.group(1).trim();
|
||||||
|
}
|
||||||
|
return "Fig. " + counter;
|
||||||
|
}
|
||||||
|
|
||||||
|
private FigureType classifyType(String caption, String pageText) {
|
||||||
|
String combined = ((caption != null ? caption : "") + " " + (pageText != null ? pageText : "")).toLowerCase();
|
||||||
|
if (combined.contains("mri") || combined.contains("ct ") || combined.contains("magnetic")
|
||||||
|
|| combined.contains("tomography")) return FigureType.MRI_CT_SCAN;
|
||||||
|
if (combined.contains("intraoperative") || combined.contains("intra-op")) return FigureType.INTRAOPERATIVE_IMAGE;
|
||||||
|
if (caption != null && caption.toLowerCase().startsWith("table")) return FigureType.TABLE;
|
||||||
|
if (combined.contains("chart") || combined.contains("histogram") || combined.contains("graph"))
|
||||||
|
return FigureType.CHART;
|
||||||
|
if (combined.contains("photograph") || combined.contains("photo")) return FigureType.SURGICAL_PHOTOGRAPH;
|
||||||
|
return FigureType.ANATOMICAL_DIAGRAM;
|
||||||
|
}
|
||||||
|
}
|
||||||
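`detectCaption` above returns only the first caption line of a page, so every image extracted from that page receives the same caption and label. As a rough illustration of the caption and label detection, the sketch below collects all caption lines on a page instead. `CaptionSketch`, `findCaptions` and the `CAPTION_LINE` expression are assumptions made for this example, not the service's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptionSketch {

    // Assumption: a caption starts a line with "Fig", "Fig." or "Figure" followed by a number.
    private static final Pattern CAPTION_LINE = Pattern.compile(
            "^((?:Figure|Fig\\.?)\\s*(\\d+(?:[\\-.]\\d+)*)[^\\n]*)",
            Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);

    /** Maps each label ("Fig. 12-4") to its full caption line, in page order. */
    public static Map<String, String> findCaptions(String pageText) {
        Map<String, String> captions = new LinkedHashMap<>();
        if (pageText == null) return captions;
        Matcher m = CAPTION_LINE.matcher(pageText);
        while (m.find()) {
            // group(2) is the figure number, group(1) the whole caption line.
            captions.put("Fig. " + m.group(2), m.group(1).trim());
        }
        return captions;
    }
}
```

Inline mentions such as "see Fig. 1" are ignored because the expression is anchored to the start of a line. Pairing each caption with the right image would still need positional information that this sketch does not attempt to recover.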
@@ -0,0 +1,11 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.springframework.data.jpa.repository.JpaRepository;
|
||||||
|
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
public interface FigureRepository extends JpaRepository<FigureEntity, String> {
|
||||||
|
List<FigureEntity> findAllByBookId(UUID bookId);
|
||||||
|
void deleteAllByBookId(UUID bookId);
|
||||||
|
}
|
||||||
@@ -0,0 +1,10 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
public enum FigureType {
|
||||||
|
ANATOMICAL_DIAGRAM,
|
||||||
|
SURGICAL_PHOTOGRAPH,
|
||||||
|
MRI_CT_SCAN,
|
||||||
|
TABLE,
|
||||||
|
CHART,
|
||||||
|
INTRAOPERATIVE_IMAGE
|
||||||
|
}
|
||||||
@@ -0,0 +1,71 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.slf4j.Logger;
|
||||||
|
import org.slf4j.LoggerFactory;
|
||||||
|
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
|
||||||
|
import org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig;
|
||||||
|
import org.springframework.core.io.FileSystemResource;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
import org.springframework.transaction.annotation.Transactional;
|
||||||
|
|
||||||
|
import java.nio.file.Path;
|
||||||
|
import java.util.ArrayList;
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Parses a PDF into page-level SectionEntity records stored in Postgres.
|
||||||
|
* Each page becomes one section, grouped under a single chapter per book.
|
||||||
|
*/
|
||||||
|
@Service
|
||||||
|
public class PdfStructureParser {
|
||||||
|
|
||||||
|
private static final Logger log = LoggerFactory.getLogger(PdfStructureParser.class);
|
||||||
|
|
||||||
|
private final ChapterRepository chapterRepository;
|
||||||
|
private final SectionRepository sectionRepository;
|
||||||
|
|
||||||
|
public PdfStructureParser(ChapterRepository chapterRepository,
|
||||||
|
SectionRepository sectionRepository) {
|
||||||
|
this.chapterRepository = chapterRepository;
|
||||||
|
this.sectionRepository = sectionRepository;
|
||||||
|
}
|
||||||
|
|
||||||
|
@Transactional
|
||||||
|
public List<SectionEntity> parse(UUID bookId, String bookTitle, Path pdfPath) {
|
||||||
|
log.info("Parsing PDF structure for book {}", bookId);
|
||||||
|
|
||||||
|
// One chapter per book
|
||||||
|
String chapterId = bookId + "-ch1";
|
||||||
|
ChapterEntity chapter = new ChapterEntity(chapterId, bookId, 1, bookTitle, 1);
|
||||||
|
chapterRepository.save(chapter);
|
||||||
|
|
||||||
|
// One section per page
|
||||||
|
PagePdfDocumentReader reader = new PagePdfDocumentReader(
|
||||||
|
new FileSystemResource(pdfPath.toFile()),
|
||||||
|
PdfDocumentReaderConfig.builder().withPagesPerDocument(1).build()
|
||||||
|
);
|
||||||
|
|
||||||
|
List<org.springframework.ai.document.Document> pages = reader.get();
|
||||||
|
List<SectionEntity> sections = new ArrayList<>();
|
||||||
|
|
||||||
|
for (int i = 0; i < pages.size(); i++) {
|
||||||
|
int pageNum = i + 1;
|
||||||
|
String text = pages.get(i).getText();
|
||||||
|
if (text == null || text.isBlank()) continue;
|
||||||
|
|
||||||
|
String sectionId = bookId + "-p" + pageNum;
|
||||||
|
SectionEntity section = new SectionEntity(
|
||||||
|
sectionId, chapterId, bookId,
|
||||||
|
String.valueOf(pageNum),
|
||||||
|
"Page " + pageNum,
|
||||||
|
pageNum, pageNum,
|
||||||
|
text
|
||||||
|
);
|
||||||
|
sections.add(sectionRepository.save(section));
|
||||||
|
}
|
||||||
|
|
||||||
|
log.info("Parsed {} sections for book {}", sections.size(), bookId);
|
||||||
|
return sections;
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,63 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import jakarta.persistence.*;
|
||||||
|
import java.time.Instant;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
@Entity
|
||||||
|
@Table(name = "section")
|
||||||
|
public class SectionEntity {
|
||||||
|
|
||||||
|
@Id
|
||||||
|
@Column(name = "id", length = 200)
|
||||||
|
private String id;
|
||||||
|
|
||||||
|
@Column(name = "chapter_id", nullable = false, length = 200)
|
||||||
|
private String chapterId;
|
||||||
|
|
||||||
|
@Column(name = "book_id", nullable = false)
|
||||||
|
private UUID bookId;
|
||||||
|
|
||||||
|
@Column(name = "number", length = 50)
|
||||||
|
private String number;
|
||||||
|
|
||||||
|
@Column(name = "title", length = 500)
|
||||||
|
private String title;
|
||||||
|
|
||||||
|
@Column(name = "page_start", nullable = false)
|
||||||
|
private int pageStart;
|
||||||
|
|
||||||
|
@Column(name = "page_end", nullable = false)
|
||||||
|
private int pageEnd;
|
||||||
|
|
||||||
|
@Column(name = "full_text", nullable = false, columnDefinition = "TEXT")
|
||||||
|
private String fullText;
|
||||||
|
|
||||||
|
@Column(name = "created_at", nullable = false)
|
||||||
|
private Instant createdAt;
|
||||||
|
|
||||||
|
public SectionEntity() {}
|
||||||
|
|
||||||
|
public SectionEntity(String id, String chapterId, UUID bookId, String number,
|
||||||
|
String title, int pageStart, int pageEnd, String fullText) {
|
||||||
|
this.id = id;
|
||||||
|
this.chapterId = chapterId;
|
||||||
|
this.bookId = bookId;
|
||||||
|
this.number = number;
|
||||||
|
this.title = title;
|
||||||
|
this.pageStart = pageStart;
|
||||||
|
this.pageEnd = pageEnd;
|
||||||
|
this.fullText = fullText;
|
||||||
|
this.createdAt = Instant.now();
|
||||||
|
}
|
||||||
|
|
||||||
|
public String getId() { return id; }
|
||||||
|
public String getChapterId() { return chapterId; }
|
||||||
|
public UUID getBookId() { return bookId; }
|
||||||
|
public String getNumber() { return number; }
|
||||||
|
public String getTitle() { return title; }
|
||||||
|
public int getPageStart() { return pageStart; }
|
||||||
|
public int getPageEnd() { return pageEnd; }
|
||||||
|
public String getFullText() { return fullText; }
|
||||||
|
public Instant getCreatedAt() { return createdAt; }
|
||||||
|
}
|
||||||
@@ -0,0 +1,11 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.springframework.data.jpa.repository.JpaRepository;
|
||||||
|
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
public interface SectionRepository extends JpaRepository<SectionEntity, String> {
|
||||||
|
List<SectionEntity> findAllByBookId(UUID bookId);
|
||||||
|
void deleteAllByBookId(UUID bookId);
|
||||||
|
}
|
||||||
@@ -0,0 +1,65 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.springframework.ai.document.Document;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
|
||||||
|
import java.util.ArrayList;
|
||||||
|
import java.util.HashMap;
|
||||||
|
import java.util.List;
|
||||||
|
import java.util.Map;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Splits a SectionEntity's full text into overlapping chunks for vector embedding.
|
||||||
|
* Target size: ~1800 characters (~450 tokens); overlap: 200 characters.
|
||||||
|
*/
|
||||||
|
@Service
|
||||||
|
public class TextChunkingService {
|
||||||
|
|
||||||
|
private static final int TARGET_CHARS = 1800;
|
||||||
|
private static final int OVERLAP_CHARS = 200;
|
||||||
|
|
||||||
|
public List<Document> chunk(SectionEntity section, String bookTitle) {
|
||||||
|
String text = section.getFullText();
|
||||||
|
if (text == null || text.isBlank()) return List.of();
|
||||||
|
|
||||||
|
List<String> windows = split(text);
|
||||||
|
List<Document> documents = new ArrayList<>();
|
||||||
|
|
||||||
|
for (int i = 0; i < windows.size(); i++) {
|
||||||
|
String chunkId = UUID.randomUUID().toString();
|
||||||
|
Map<String, Object> metadata = buildMetadata(section, bookTitle, i, windows.size(), chunkId);
|
||||||
|
documents.add(new Document(chunkId, windows.get(i), metadata));
|
||||||
|
}
|
||||||
|
return documents;
|
||||||
|
}
|
||||||
|
|
||||||
|
private List<String> split(String text) {
|
||||||
|
List<String> windows = new ArrayList<>();
|
||||||
|
int start = 0;
|
||||||
|
while (start < text.length()) {
|
||||||
|
int end = Math.min(start + TARGET_CHARS, text.length());
|
||||||
|
windows.add(text.substring(start, end));
|
||||||
|
if (end == text.length()) break;
|
||||||
|
start = end - OVERLAP_CHARS;
|
||||||
|
}
|
||||||
|
return windows;
|
||||||
|
}
|
||||||
|
|
||||||
|
private Map<String, Object> buildMetadata(SectionEntity section, String bookTitle,
|
||||||
|
int index, int total, String chunkId) {
|
||||||
|
Map<String, Object> m = new HashMap<>();
|
||||||
|
m.put("type", "TEXT");
|
||||||
|
m.put("book_id", section.getBookId().toString());
|
||||||
|
m.put("book_title", bookTitle);
|
||||||
|
m.put("chapter_id", section.getChapterId());
|
||||||
|
m.put("section_id", section.getId());
|
||||||
|
m.put("section_title", section.getTitle() != null ? section.getTitle() : "");
|
||||||
|
m.put("page_start", section.getPageStart());
|
||||||
|
m.put("page_end", section.getPageEnd());
|
||||||
|
m.put("chunk_index", index);
|
||||||
|
m.put("total_chunks", total);
|
||||||
|
m.put("chunk_id", chunkId);
|
||||||
|
return m;
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,49 @@
|
|||||||
|
package com.aiteacher.document;
|
||||||
|
|
||||||
|
import org.slf4j.Logger;
|
||||||
|
import org.slf4j.LoggerFactory;
|
||||||
|
import org.springframework.ai.chat.client.ChatClient;
|
||||||
|
import org.springframework.core.io.FileSystemResource;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
import org.springframework.util.MimeTypeUtils;
|
||||||
|
|
||||||
|
import java.nio.file.Path;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Generates a clinical text description for an extracted figure image
|
||||||
|
* using the OpenAI vision model via Spring AI ChatClient.
|
||||||
|
*/
|
||||||
|
@Service
|
||||||
|
public class VisionDescriptionService {
|
||||||
|
|
||||||
|
private static final Logger log = LoggerFactory.getLogger(VisionDescriptionService.class);
|
||||||
|
|
||||||
|
private static final String PROMPT =
|
||||||
|
"You are a neurosurgery educator. Provide a brief 2-3 sentence clinical description of " +
|
||||||
|
"this image. Focus on anatomical structures, surgical landmarks, labels, and clinical " +
|
||||||
|
"significance. If text or labels are visible, include them verbatim.";
|
||||||
|
|
||||||
|
private final ChatClient chatClient;
|
||||||
|
|
||||||
|
public VisionDescriptionService(ChatClient chatClient) {
|
||||||
|
this.chatClient = chatClient;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns a description string. Falls back to the provided caption if vision fails.
|
||||||
|
*/
|
||||||
|
public String describe(Path imagePath, String captionFallback) {
|
||||||
|
try {
|
||||||
|
return chatClient.prompt()
|
||||||
|
.user(u -> u
|
||||||
|
.text(PROMPT)
|
||||||
|
.media(MimeTypeUtils.IMAGE_PNG, new FileSystemResource(imagePath.toFile())))
|
||||||
|
.call()
|
||||||
|
.content();
|
||||||
|
} catch (Exception ex) {
|
||||||
|
log.warn("Vision description failed for {}: {} — using caption as fallback",
|
||||||
|
imagePath.getFileName(), ex.getMessage());
|
||||||
|
return captionFallback != null ? captionFallback : "Figure";
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,24 @@
|
|||||||
|
package com.aiteacher.figure;
|
||||||
|
|
||||||
|
import java.awt.image.BufferedImage;
|
||||||
|
import java.nio.file.Path;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
public interface FigureStorageService {
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Saves an extracted image to the figure store and returns the relative path
|
||||||
|
* (relative to the configured base-path) stored in the database.
|
||||||
|
*/
|
||||||
|
String save(UUID bookId, String figureId, BufferedImage image);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Resolves a stored relative path to an absolute filesystem path.
|
||||||
|
*/
|
||||||
|
Path resolve(String relativePath);
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Deletes all figure files for the given book.
|
||||||
|
*/
|
||||||
|
void deleteAll(UUID bookId);
|
||||||
|
}
|
||||||
@@ -0,0 +1,59 @@
|
|||||||
|
package com.aiteacher.figure;
|
||||||
|
|
||||||
|
import org.slf4j.Logger;
|
||||||
|
import org.slf4j.LoggerFactory;
|
||||||
|
import org.springframework.beans.factory.annotation.Value;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
|
||||||
|
import javax.imageio.ImageIO;
|
||||||
|
import java.awt.image.BufferedImage;
|
||||||
|
import java.io.IOException;
|
||||||
|
import java.nio.file.Files;
|
||||||
|
import java.nio.file.Path;
|
||||||
|
import java.nio.file.Paths;
|
||||||
|
import java.util.UUID;
|
||||||
|
|
||||||
|
@Service
|
||||||
|
public class LocalFigureStorageService implements FigureStorageService {
|
||||||
|
|
||||||
|
private static final Logger log = LoggerFactory.getLogger(LocalFigureStorageService.class);
|
||||||
|
|
||||||
|
private final Path basePath;
|
||||||
|
|
||||||
|
public LocalFigureStorageService(@Value("${app.figure-storage.base-path:./uploads}") String basePath) {
|
||||||
|
this.basePath = Paths.get(basePath).toAbsolutePath().normalize();
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public String save(UUID bookId, String figureId, BufferedImage image) {
|
||||||
|
try {
|
||||||
|
Path dir = basePath.resolve("figures").resolve(bookId.toString());
|
||||||
|
Files.createDirectories(dir);
|
||||||
|
String filename = figureId + ".png";
|
||||||
|
Path file = dir.resolve(filename);
|
||||||
|
ImageIO.write(image, "PNG", file.toFile());
|
||||||
|
// Return relative path for storage in DB
|
||||||
|
return "figures/" + bookId + "/" + filename;
|
||||||
|
} catch (IOException ex) {
|
||||||
|
throw new RuntimeException("Failed to save figure " + figureId, ex);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public Path resolve(String relativePath) {
|
||||||
|
return basePath.resolve(relativePath);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
public void deleteAll(UUID bookId) {
|
||||||
|
Path dir = basePath.resolve("figures").resolve(bookId.toString());
|
||||||
|
if (!Files.exists(dir)) return;
|
||||||
|
try (var walk = Files.walk(dir)) {
|
||||||
|
walk.sorted(java.util.Comparator.reverseOrder())
|
||||||
|
.map(Path::toFile)
|
||||||
|
.forEach(java.io.File::delete);
|
||||||
|
} catch (IOException ex) {
|
||||||
|
log.warn("Could not fully delete figures for book {}: {}", bookId, ex.getMessage());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,111 @@
|
|||||||
|
package com.aiteacher.retrieval;
|
||||||
|
|
||||||
|
import com.aiteacher.document.*;
|
||||||
|
import org.slf4j.Logger;
|
||||||
|
import org.slf4j.LoggerFactory;
|
||||||
|
import org.springframework.ai.document.Document;
|
||||||
|
import org.springframework.ai.vectorstore.SearchRequest;
|
||||||
|
import org.springframework.ai.vectorstore.VectorStore;
|
||||||
|
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
|
||||||
|
import org.springframework.stereotype.Service;
|
||||||
|
|
||||||
|
import java.util.*;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Dual-modality retriever: searches text chunks and figure captions independently,
|
||||||
|
* then expands text hits to their parent sections and merges linked figures.
|
||||||
|
*/
|
||||||
|
@Service
|
||||||
|
public class NeurosurgeryRetriever {
|
||||||
|
|
||||||
|
private static final Logger log = LoggerFactory.getLogger(NeurosurgeryRetriever.class);
|
||||||
|
|
||||||
|
private static final int TEXT_TOP_K = 5;
|
||||||
|
private static final int FIGURE_TOP_K = 3;
|
||||||
|
|
||||||
|
private final VectorStore vectorStore;
|
||||||
|
private final SectionRepository sectionRepository;
|
||||||
|
private final FigureRepository figureRepository;
|
||||||
|
private final ChunkFigureRefRepository chunkFigureRefRepository;
|
||||||
|
|
||||||
|
public NeurosurgeryRetriever(VectorStore vectorStore,
|
||||||
|
SectionRepository sectionRepository,
|
||||||
|
FigureRepository figureRepository,
|
||||||
|
ChunkFigureRefRepository chunkFigureRefRepository) {
|
||||||
|
this.vectorStore = vectorStore;
|
||||||
|
this.sectionRepository = sectionRepository;
|
||||||
|
this.figureRepository = figureRepository;
|
||||||
|
this.chunkFigureRefRepository = chunkFigureRefRepository;
|
||||||
|
}
|
||||||
|
|
||||||
|
public RetrievalResult retrieve(String query, UUID bookId) {
|
||||||
|
FilterExpressionBuilder b = new FilterExpressionBuilder();
|
||||||
|
|
||||||
|
// 1. Text chunk search
|
||||||
|
List<Document> textHits = vectorStore.similaritySearch(
|
||||||
|
SearchRequest.builder()
|
||||||
|
.query(query)
|
||||||
|
.topK(TEXT_TOP_K)
|
||||||
|
.filterExpression(b.and(
|
||||||
|
b.eq("type", "TEXT"),
|
||||||
|
b.eq("book_id", bookId.toString())
|
||||||
|
).build())
|
||||||
|
.build()
|
||||||
|
);
|
||||||
|
|
||||||
|
// 2. Figure caption search (independent topK)
|
||||||
|
List<Document> figureHits = vectorStore.similaritySearch(
|
||||||
|
SearchRequest.builder()
|
||||||
|
.query(query)
|
||||||
|
.topK(FIGURE_TOP_K)
|
||||||
|
.filterExpression(b.and(
|
||||||
|
b.eq("type", "FIGURE"),
|
||||||
|
b.eq("book_id", bookId.toString())
|
||||||
|
).build())
|
||||||
|
.build()
|
||||||
|
);
|
||||||
|
|
||||||
|
// 3. Expand text chunks to parent sections from Postgres
|
||||||
|
List<String> sectionIds = textHits.stream()
|
||||||
|
.map(d -> (String) d.getMetadata().get("section_id"))
|
||||||
|
.filter(Objects::nonNull)
|
||||||
|
.distinct()
|
||||||
|
.toList();
|
||||||
|
List<SectionEntity> sections = sectionIds.isEmpty()
|
||||||
|
? List.of()
|
||||||
|
: sectionRepository.findAllById(sectionIds);
|
||||||
|
|
||||||
|
// 4. Fetch figures explicitly linked to retrieved chunks
|
||||||
|
List<UUID> chunkIds = textHits.stream()
|
||||||
|
.map(d -> {
|
||||||
|
try { return UUID.fromString(d.getId()); }
|
||||||
|
catch (Exception e) { return null; }
|
||||||
|
})
|
||||||
|
.filter(Objects::nonNull)
|
||||||
|
.toList();
|
||||||
|
List<String> linkedFigureIds = chunkIds.isEmpty()
|
||||||
|
? List.of()
|
||||||
|
: chunkFigureRefRepository.findByChunkIdIn(chunkIds)
|
||||||
|
.stream().map(ChunkFigureRefEntity::getFigureId).distinct().toList();
|
||||||
|
List<FigureEntity> linkedFigures = linkedFigureIds.isEmpty()
|
||||||
|
? List.of()
|
||||||
|
: figureRepository.findAllById(linkedFigureIds);
|
||||||
|
|
||||||
|
// 5. Collect figures from semantic figure search
|
||||||
|
List<String> semanticFigureIds = figureHits.stream()
|
||||||
|
.map(d -> (String) d.getMetadata().get("figure_id"))
|
||||||
|
.filter(Objects::nonNull)
|
||||||
|
.toList();
|
||||||
|
List<FigureEntity> semanticFigures = semanticFigureIds.isEmpty()
|
||||||
|
? List.of()
|
||||||
|
: figureRepository.findAllById(semanticFigureIds);
|
||||||
|
|
||||||
|
// 6. Merge and deduplicate figures by figureId (linked figures take precedence)
|
||||||
|
Map<String, FigureEntity> merged = new LinkedHashMap<>();
|
||||||
|
linkedFigures.forEach(f -> merged.put(f.getId(), f));
|
||||||
|
semanticFigures.forEach(f -> merged.putIfAbsent(f.getId(), f));
|
||||||
|
|
||||||
|
log.debug("Retrieved {} sections, {} figures for query", sections.size(), merged.size());
|
||||||
|
return new RetrievalResult(sections, new ArrayList<>(merged.values()));
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,11 @@
|
|||||||
|
package com.aiteacher.retrieval;
|
||||||
|
|
||||||
|
import com.aiteacher.document.FigureEntity;
|
||||||
|
import com.aiteacher.document.SectionEntity;
|
||||||
|
|
||||||
|
import java.util.List;
|
||||||
|
|
||||||
|
public record RetrievalResult(
|
||||||
|
List<SectionEntity> parentSections,
|
||||||
|
List<FigureEntity> figures
|
||||||
|
) {}
|
||||||
@@ -47,6 +47,16 @@ spring:
|
|||||||
max-size: 8
|
max-size: 8
|
||||||
queue-capacity: 50
|
queue-capacity: 50
|
||||||
|
|
||||||
|
logging:
|
||||||
|
level:
|
||||||
|
"[org.apache.pdfbox]": ERROR
|
||||||
|
|
||||||
app:
|
app:
|
||||||
auth:
|
auth:
|
||||||
password: ${APP_PASSWORD:changeme}
|
password: ${APP_PASSWORD:changeme}
|
||||||
|
figure-storage:
|
||||||
|
base-path: ${FIGURE_STORAGE_PATH:./uploads}
|
||||||
|
min-image-size-px: 100
|
||||||
|
embedding:
|
||||||
|
batch-size: 20
|
||||||
|
batch-delay-ms: 2000
|
||||||
|
|||||||
@@ -0,0 +1,28 @@
|
|||||||
|
-- ============================================================
|
||||||
|
-- V4: Document hierarchy — chapter and section tables
|
||||||
|
-- Supports parent-child retrieval pattern for RAG precision.
|
||||||
|
-- ============================================================
|
||||||
|
|
||||||
|
CREATE TABLE IF NOT EXISTS chapter (
|
||||||
|
id VARCHAR(200) PRIMARY KEY,
|
||||||
|
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||||
|
number INT NOT NULL DEFAULT 1,
|
||||||
|
title VARCHAR(500),
|
||||||
|
page_start INT,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE TABLE IF NOT EXISTS section (
|
||||||
|
id VARCHAR(200) PRIMARY KEY,
|
||||||
|
chapter_id VARCHAR(200) NOT NULL REFERENCES chapter(id) ON DELETE CASCADE,
|
||||||
|
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||||
|
number VARCHAR(50),
|
||||||
|
title VARCHAR(500),
|
||||||
|
page_start INT NOT NULL,
|
||||||
|
page_end INT NOT NULL,
|
||||||
|
full_text TEXT NOT NULL,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_section_book ON section(book_id);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_section_chapter ON section(chapter_id);
|
||||||
@@ -0,0 +1,29 @@
|
|||||||
|
-- ============================================================
|
||||||
|
-- V5: Figures and chunk-to-figure reference table
|
||||||
|
-- figure: metadata + file path for each extracted image
|
||||||
|
-- chunk_figure_ref: links vector-store chunks to figures
|
||||||
|
-- ============================================================
|
||||||
|
|
||||||
|
CREATE TABLE IF NOT EXISTS figure (
|
||||||
|
id VARCHAR(200) PRIMARY KEY,
|
||||||
|
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||||
|
section_id VARCHAR(200) REFERENCES section(id) ON DELETE SET NULL,
|
||||||
|
chapter_id VARCHAR(200) REFERENCES chapter(id) ON DELETE SET NULL,
|
||||||
|
label VARCHAR(100),
|
||||||
|
caption TEXT,
|
||||||
|
figure_type VARCHAR(50) NOT NULL,
|
||||||
|
page INT NOT NULL,
|
||||||
|
image_path VARCHAR(1000) NOT NULL,
|
||||||
|
caption_embedding_id UUID,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE TABLE IF NOT EXISTS chunk_figure_ref (
|
||||||
|
chunk_id UUID NOT NULL,
|
||||||
|
figure_id VARCHAR(200) NOT NULL REFERENCES figure(id) ON DELETE CASCADE,
|
||||||
|
mention_page INT,
|
||||||
|
PRIMARY KEY (chunk_id, figure_id)
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_figure_book ON figure(book_id);
|
||||||
|
CREATE INDEX IF NOT EXISTS idx_cfr_chunk ON chunk_figure_ref(chunk_id);
|
||||||
@@ -5,22 +5,47 @@
|
|||||||
<div v-if="isUser" class="message-content">{{ message.content }}</div>
|
<div v-if="isUser" class="message-content">{{ message.content }}</div>
|
||||||
<div v-else class="message-content message-content--markdown" v-html="renderedContent"></div>
|
<div v-else class="message-content message-content--markdown" v-html="renderedContent"></div>
|
||||||
|
|
||||||
<!-- Source chips for assistant messages -->
|
<!-- Sources for assistant messages -->
|
||||||
<div v-if="!isUser && message.sources && message.sources.length > 0" class="message-sources">
|
<div v-if="!isUser && message.sources && message.sources.length > 0" class="message-sources">
|
||||||
<div class="sources-label">Sources:</div>
|
<div class="sources-label">Sources:</div>
|
||||||
<div class="source-list">
|
<div class="source-list">
|
||||||
|
<!-- TEXT sources -->
|
||||||
<div
|
<div
|
||||||
v-for="(source, idx) in message.sources"
|
v-for="(source, idx) in textSources"
|
||||||
:key="idx"
|
:key="'text-' + idx"
|
||||||
class="source-item"
|
class="source-item"
|
||||||
>
|
>
|
||||||
<div class="source-chip">
|
<div class="source-chip source-chip--text">
|
||||||
<span class="source-book-icon">📖</span>
|
<span class="source-icon">📖</span>
|
||||||
<span class="source-book-title">{{ source.bookTitle }}</span>
|
<span class="source-book-title">{{ source.bookTitle }}</span>
|
||||||
<span v-if="source.page" class="source-page">p. {{ source.page }}</span>
|
<span v-if="source.page" class="source-page">p. {{ source.page }}</span>
|
||||||
</div>
|
</div>
|
||||||
<div v-if="source.chunkText" class="source-chunk">{{ source.chunkText }}</div>
|
<div v-if="source.chunkText" class="source-chunk">{{ source.chunkText }}</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|
<!-- FIGURE sources -->
|
||||||
|
<div
|
||||||
|
v-for="(source, idx) in figureSources"
|
||||||
|
:key="'fig-' + idx"
|
||||||
|
class="source-item source-item--figure"
|
||||||
|
>
|
||||||
|
<div class="source-chip source-chip--figure">
|
||||||
|
<span class="source-icon">🖼️</span>
|
||||||
|
<span class="source-figure-label">{{ source.label || 'Figure' }}</span>
|
||||||
|
<span v-if="source.page" class="source-page">p. {{ source.page }}</span>
|
||||||
|
<span v-if="source.figureType" class="source-figure-type">{{ formatFigureType(source.figureType) }}</span>
|
||||||
|
</div>
|
||||||
|
<div v-if="source.caption" class="source-caption">{{ source.caption }}</div>
|
||||||
|
<div class="source-figure-image">
|
||||||
|
<img
|
||||||
|
:src="source.imageUrl"
|
||||||
|
:alt="source.caption || source.label || 'Figure'"
|
||||||
|
class="figure-img"
|
||||||
|
loading="lazy"
|
||||||
|
@error="onImageError"
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
@@ -32,7 +57,7 @@
|
|||||||
<script setup lang="ts">
|
<script setup lang="ts">
|
||||||
import { computed } from 'vue'
|
import { computed } from 'vue'
|
||||||
import { marked } from 'marked'
|
import { marked } from 'marked'
|
||||||
import type { ChatMessage } from '@/stores/chatStore'
|
import type { ChatMessage, ChatSource } from '@/stores/chatStore'
|
||||||
|
|
||||||
const props = defineProps<{
|
const props = defineProps<{
|
||||||
message: ChatMessage
|
message: ChatMessage
|
||||||
@@ -41,6 +66,36 @@ const props = defineProps<{
|
|||||||
const isUser = computed(() => props.message.role === 'USER')
|
const isUser = computed(() => props.message.role === 'USER')
|
||||||
const renderedContent = computed(() => marked.parse(props.message.content) as string)
|
const renderedContent = computed(() => marked.parse(props.message.content) as string)
|
||||||
|
|
||||||
|
const textSources = computed(() =>
|
||||||
|
(props.message.sources ?? []).filter((s: ChatSource) => s.type === 'TEXT' || !s.type)
|
||||||
|
)
|
||||||
|
|
||||||
|
const figureSources = computed(() =>
|
||||||
|
(props.message.sources ?? []).filter((s: ChatSource) => s.type === 'FIGURE')
|
||||||
|
)
|
||||||
|
|
||||||
|
function formatFigureType(type: string): string {
|
||||||
|
const labels: Record<string, string> = {
|
||||||
|
ANATOMICAL_DIAGRAM: 'Anatomical Diagram',
|
||||||
|
SURGICAL_PHOTOGRAPH: 'Surgical Photo',
|
||||||
|
MRI_CT_SCAN: 'MRI / CT',
|
||||||
|
TABLE: 'Table',
|
||||||
|
CHART: 'Chart',
|
||||||
|
INTRAOPERATIVE_IMAGE: 'Intraoperative'
|
||||||
|
}
|
||||||
|
return labels[type] ?? type
|
||||||
|
}
|
||||||
|
|
||||||
|
function onImageError(e: Event) {
|
||||||
|
const img = e.target as HTMLImageElement
|
||||||
|
img.alt = 'Image unavailable'
|
||||||
|
img.style.display = 'none'
|
||||||
|
const wrapper = img.parentElement
|
||||||
|
if (wrapper) {
|
||||||
|
wrapper.innerHTML = '<span class="figure-missing">Image unavailable</span>'
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
function formatTime(iso: string): string {
|
function formatTime(iso: string): string {
|
||||||
return new Date(iso).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })
|
return new Date(iso).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })
|
||||||
}
|
}
|
||||||
@@ -182,6 +237,55 @@ function formatTime(iso: string): string {
|
|||||||
gap: 0.25rem;
|
gap: 0.25rem;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
.source-item--figure {
|
||||||
|
gap: 0.4rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-chip {
|
||||||
|
display: inline-flex;
|
||||||
|
align-items: center;
|
||||||
|
gap: 0.25rem;
|
||||||
|
border-radius: 4px;
|
||||||
|
padding: 0.2rem 0.5rem;
|
||||||
|
font-size: 0.78rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-chip--text {
|
||||||
|
background: #ebf8ff;
|
||||||
|
border: 1px solid #bee3f8;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-chip--figure {
|
||||||
|
background: #f0fff4;
|
||||||
|
border: 1px solid #9ae6b4;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-icon {
|
||||||
|
font-size: 0.8rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-book-title {
|
||||||
|
color: #2b6cb0;
|
||||||
|
font-weight: 500;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-figure-label {
|
||||||
|
color: #276749;
|
||||||
|
font-weight: 600;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-figure-type {
|
||||||
|
color: #718096;
|
||||||
|
font-size: 0.72rem;
|
||||||
|
background: #e2e8f0;
|
||||||
|
border-radius: 3px;
|
||||||
|
padding: 0 0.3rem;
|
||||||
|
}
|
||||||
|
|
||||||
|
.source-page {
|
||||||
|
color: #718096;
|
||||||
|
}
|
||||||
|
|
||||||
.source-chunk {
|
.source-chunk {
|
||||||
font-size: 0.78rem;
|
font-size: 0.78rem;
|
||||||
color: #4a5568;
|
color: #4a5568;
|
||||||
@@ -194,28 +298,28 @@ function formatTime(iso: string): string {
|
|||||||
line-height: 1.5;
|
line-height: 1.5;
|
||||||
}
|
}
|
||||||
|
|
||||||
.source-chip {
|
.source-caption {
|
||||||
display: inline-flex;
|
|
||||||
align-items: center;
|
|
||||||
gap: 0.25rem;
|
|
||||||
background: #ebf8ff;
|
|
||||||
border: 1px solid #bee3f8;
|
|
||||||
border-radius: 4px;
|
|
||||||
padding: 0.2rem 0.5rem;
|
|
||||||
font-size: 0.78rem;
|
font-size: 0.78rem;
|
||||||
|
color: #4a5568;
|
||||||
|
font-style: italic;
|
||||||
}
|
}
|
||||||
|
|
||||||
.source-book-icon {
|
.source-figure-image {
|
||||||
font-size: 0.8rem;
|
max-width: 100%;
|
||||||
}
|
}
|
||||||
|
|
||||||
.source-book-title {
|
.figure-img {
|
||||||
color: #2b6cb0;
|
max-width: 100%;
|
||||||
font-weight: 500;
|
max-height: 300px;
|
||||||
|
border-radius: 6px;
|
||||||
|
border: 1px solid #e2e8f0;
|
||||||
|
object-fit: contain;
|
||||||
}
|
}
|
||||||
|
|
||||||
.source-page {
|
.figure-missing {
|
||||||
color: #718096;
|
font-size: 0.78rem;
|
||||||
|
color: #a0aec0;
|
||||||
|
font-style: italic;
|
||||||
}
|
}
|
||||||
|
|
||||||
.message-timestamp {
|
.message-timestamp {
|
||||||
|
|||||||
@@ -2,11 +2,25 @@ import { defineStore } from 'pinia'
|
|||||||
import { ref } from 'vue'
|
import { ref } from 'vue'
|
||||||
import { api } from '@/services/api'
|
import { api } from '@/services/api'
|
||||||
|
|
||||||
|
export interface ChatSource {
|
||||||
|
type: 'TEXT' | 'FIGURE'
|
||||||
|
bookTitle: string
|
||||||
|
page: number | null
|
||||||
|
// TEXT-specific
|
||||||
|
chunkText?: string
|
||||||
|
// FIGURE-specific
|
||||||
|
figureId?: string
|
||||||
|
label?: string
|
||||||
|
caption?: string
|
||||||
|
figureType?: string
|
||||||
|
imageUrl?: string
|
||||||
|
}
|
||||||
|
|
||||||
export interface ChatMessage {
|
export interface ChatMessage {
|
||||||
id: string
|
id: string
|
||||||
role: 'USER' | 'ASSISTANT'
|
role: 'USER' | 'ASSISTANT'
|
||||||
content: string
|
content: string
|
||||||
sources: Array<{ bookTitle: string; page: number | null; chunkText?: string }>
|
sources: ChatSource[]
|
||||||
createdAt: string
|
createdAt: string
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -0,0 +1,73 @@
# Embedding & Retrieval Pipeline Checklist: Enhanced Embedding with Image Parsing and Metadata
**Purpose**: Author self-review of embedding pipeline and retrieval requirements quality — validates completeness, clarity, and measurability before implementation tasks are written
**Created**: 2026-04-03
**Feature**: [spec.md](../spec.md) | [research.md](../research.md) | [data-model.md](../data-model.md)
**Focus**: A (Embedding pipeline) + B (Retrieval & ranking) | Depth: Standard | Audience: Author

---
## Requirement Completeness — Embedding Pipeline
- [X] CHK001 - Is the definition of "inspect every page" complete — does the spec cover pages that have no extractable content layer (fully scanned/rasterised pages)? Yes [Completeness, Spec §FR-001, Assumption §6]
- [X] CHK002 - Does FR-002 define what "independently searchable" means in practice — specifically, is it clear that image chunks must be retrievable without a co-located text chunk? [Clarity, Spec §FR-002] - No image should be retrieved along with linked text.
- [X] CHK003 - Is the minimum acceptable quality of the "descriptive textual representation" (FR-003) specified — e.g., must it include structural relationships, labelled regions, or clinical terms — or is any non-empty description sufficient? [Clarity, Spec §FR-003, Gap] - Any non-empty description is sufficient. The text just below the image should contain the correct clinical term.
- [C] CHK004 - Are the caption-detection rules defined at spec level — specifically, what pattern or signal determines that a piece of text is a caption vs. body text adjacent to an image? [Clarity, Spec §FR-004, Gap] - We assume that text starting with "Fig." followed by a number is the text description of the given image.
- [X] CHK005 - Does FR-004 specify what metadata is stored when a caption is absent — is the caption field omitted, left empty, or populated with a generated substitute? [Completeness, Spec §FR-004] - generated substitute
- [X] CHK006 - Is the "minimum meaningful-content threshold" (FR-007) quantified in the spec, or is it deferred entirely to implementation? The assumption section says "size threshold determined during implementation" — is this intentional and acceptable at the spec level? [Ambiguity, Spec §FR-007, Assumption §3] - Deferred to implementation
- [X] CHK007 - Does FR-008 specify the observable outcome of per-page image failures — specifically, is there a requirement that the book's processing status or error log is accessible to the user or admin after partial failure? [Completeness, Spec §FR-008, Gap] online logs
- [X] CHK008 - Is FR-010 ("MUST NOT degrade accuracy or completeness of text-only embedding") measurable — does the spec define a baseline or acceptance criterion against which degradation can be detected? [Measurability, Spec §FR-010, Gap] no definition
- [X] CHK009 - Are re-embedding requirements complete — does the spec cover what happens to in-progress queries and cached results while a book is being re-embedded? [Coverage, Assumption §8, Gap] - No need to take that into account.
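
The caption rule assumed in CHK004 (text starting with "Fig." followed by a number) could be expressed as a regular expression. The following is only a minimal sketch under that assumption: the class name `CaptionDetector`, the method `figureNumber`, and the exact pattern are hypothetical and are not defined by the spec or by this commit. Variants such as "Figure 3" would need an extended pattern.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the commit): detects "Fig. <number>" caption lines. */
final class CaptionDetector {

    // Assumed rule: optional leading whitespace, "Fig.", optional whitespace,
    // then a figure number such as 3 or 3.12.
    private static final Pattern CAPTION =
            Pattern.compile("^\\s*Fig\\.\\s*(\\d+(?:\\.\\d+)*)");

    private CaptionDetector() {}

    /** Returns the figure number if the line looks like a caption, otherwise empty. */
    static Optional<String> figureNumber(String line) {
        if (line == null) {
            return Optional.empty();
        }
        Matcher m = CAPTION.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```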
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Requirement Completeness — Retrieval & Ranking
|
||||||
|
|
||||||
|
- [X] CHK010 - Does FR-006 define how image and text chunks are ranked relative to each other — is ranking unified (single score), or are the two modalities ranked independently with separate topK controls? [Clarity, Spec §FR-006, Gap] - Ranked independently, with a separate topK per modality.
|
||||||
|
|
||||||
|
- [X] CHK011 - Is the relevance threshold for figure retrieval specified — i.e., at what similarity score (or other criterion) should a figure be excluded from results? [Clarity, Spec §FR-006, Gap] - Not specified.
|
||||||
|
|
||||||
|
- [X] CHK012 - Are deduplication rules defined for the case where the same figure appears both in the semantic figure search and the chunk-to-figure reference lookup — which representation wins, or are both included? [Completeness, data-model.md §RetrievalResult, Gap] - Not specified.
|
||||||
|
|
||||||
|
- [X] CHK013 - Is the requirement for parent section context expansion in the spec — specifically, is there a requirement that the LLM receives the full section text (not just the chunk) when a text chunk is retrieved? [Gap, research.md §Decision 1] - the LLM should receive the full section to have maximum context.
|
||||||
|
|
||||||
|
- [X] CHK014 - Does the spec define the required structure of the LLM prompt when both text context and figures are present — or is prompt design left entirely to implementation? [Completeness, Gap] - Left to implementation
|
||||||
|
|
||||||
|
- [X] CHK015 - Is SC-002 ("70% recall on image queries") sufficient as a measurability criterion — is the test set composition (10 queries) and evaluation method documented, or does it rely on an undefined manual process? [Measurability, Spec §SC-002] - Manual process.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Scenario Coverage — Edge & Exception Cases
|
||||||
|
|
||||||
|
- [X] CHK016 - Does the spec address the scenario where a query is relevant to a book section that has figures but none of those figures rank above the retrieval threshold — is the expected fallback behaviour defined? [Coverage, Edge Case, Gap] - In this case the figure should still be retrieved and shown to the user.
|
||||||
|
|
||||||
|
- [X] CHK017 - Is the scenario of a figure retrieved in search results but whose image file is missing from the file store covered — what should the system return to the user in that case? [Coverage, Exception Flow, Gap] - A missing-image error, shown in the frontend as a broken image link.
|
||||||
|
|
||||||
|
- [X] CHK018 - Are requirements defined for multi-image pages where images have conflicting captions or share a single composite caption — which image gets the caption, or is it duplicated? [Coverage, Spec §FR-004, Edge Case] - This case does not exist.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Consistency & Alignment
|
||||||
|
|
||||||
|
- [X] CHK019 - Are the metadata fields required by FR-004 and FR-005 fully consistent with the metadata schema defined in data-model.md — specifically, do the mandatory fields in the spec match the `type`, `section_id`, and `section_title` fields in the data model? [Consistency, Spec §FR-004, data-model.md §Vector Store Documents] - Left to implementation
|
||||||
|
|
||||||
|
- [X] CHK020 - Is SC-003 ("processing time ≤ 3× baseline") consistent with FR-003 — if description generation requires a vision model call per image, is the 3× cap realistic for a 500-page book with dense figures, and is this assumption documented? [Consistency, Spec §SC-003, Assumption §3, Gap] - not documented
|
||||||
|
|
||||||
|
- [X] CHK021 - Does the spec's description of citation display (FR-009) align with the `sources` format change documented in contracts/api.md — are the fields the spec says must be "distinct" actually represented distinctly in the API response? [Consistency, Spec §FR-009, contracts/api.md §4] - A section with an image source should be displayed in the frontend. Text sources and image sources are distinct.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Notes
|
||||||
|
|
||||||
|
- Items marked `[Gap]` indicate requirements that appear absent or deferred; resolve before generating tasks
|
||||||
|
- Items marked `[Ambiguity]` require a clearer definition in the spec before implementation starts
|
||||||
|
- Items marked `[Consistency]` should be cross-checked between spec.md, data-model.md, and contracts/api.md
|
||||||
|
- Mark items `[x]` when resolved; add inline notes with the resolution for traceability
|
||||||
@@ -0,0 +1,34 @@
|
|||||||
|
# Specification Quality Checklist: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Purpose**: Validate specification completeness and quality before proceeding to planning
|
||||||
|
**Created**: 2026-04-03
|
||||||
|
**Feature**: [spec.md](../spec.md)
|
||||||
|
|
||||||
|
## Content Quality
|
||||||
|
|
||||||
|
- [x] No implementation details (languages, frameworks, APIs)
|
||||||
|
- [x] Focused on user value and business needs
|
||||||
|
- [x] Written for non-technical stakeholders
|
||||||
|
- [x] All mandatory sections completed
|
||||||
|
|
||||||
|
## Requirement Completeness
|
||||||
|
|
||||||
|
- [x] No [NEEDS CLARIFICATION] markers remain
|
||||||
|
- [x] Requirements are testable and unambiguous
|
||||||
|
- [x] Success criteria are measurable
|
||||||
|
- [x] Success criteria are technology-agnostic (no implementation details)
|
||||||
|
- [x] All acceptance scenarios are defined
|
||||||
|
- [x] Edge cases are identified
|
||||||
|
- [x] Scope is clearly bounded
|
||||||
|
- [x] Dependencies and assumptions identified
|
||||||
|
|
||||||
|
## Feature Readiness
|
||||||
|
|
||||||
|
- [x] All functional requirements have clear acceptance criteria
|
||||||
|
- [x] User scenarios cover primary flows
|
||||||
|
- [x] Feature meets measurable outcomes defined in Success Criteria
|
||||||
|
- [x] No implementation details leak into specification
|
||||||
|
|
||||||
|
## Notes
|
||||||
|
|
||||||
|
- All items pass. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
|
||||||
@@ -0,0 +1,172 @@
|
|||||||
|
# API Contracts: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03
|
||||||
|
**Base path**: `/api/v1`
|
||||||
|
**Auth**: HTTP Basic (existing)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## New / Changed Endpoints
|
||||||
|
|
||||||
|
### 1. Re-embed a book (new)
|
||||||
|
|
||||||
|
Triggers a full re-embedding of an already-processed book, replacing all existing chunks and
|
||||||
|
figures with the new image-aware pipeline output. Safe to call on books previously embedded
|
||||||
|
by feature 001.
|
||||||
|
|
||||||
|
```
|
||||||
|
POST /api/v1/books/{id}/reembed
|
||||||
|
```
|
||||||
|
|
||||||
|
**Path parameters**
|
||||||
|
|
||||||
|
| Parameter | Type | Description |
|
||||||
|
|-----------|------|-------------|
|
||||||
|
| `id` | UUID | Book ID |
|
||||||
|
|
||||||
|
**Response** `202 Accepted`
|
||||||
|
|
||||||
|
```json
|
||||||
|
{ "bookId": "uuid", "status": "PROCESSING" }
|
||||||
|
```
|
||||||
|
|
||||||
|
**Error responses**
|
||||||
|
|
||||||
|
| Status | Condition |
|
||||||
|
|--------|-----------|
|
||||||
|
| 404 | Book not found |
|
||||||
|
| 409 | Book already in PROCESSING state |
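
The status codes above can be read as a small decision rule. The sketch below is one possible way to express it; `ReembedPolicy` and `responseStatus` are hypothetical names, and the `BookStatus` values are taken from the state machine in data-model.md.

```java
import java.util.Optional;

// Hypothetical helper: maps the current book state to the contract's HTTP status.
final class ReembedPolicy {

    enum BookStatus { PENDING, PROCESSING, READY, FAILED }

    private ReembedPolicy() {}

    /** Returns 404 for an unknown book, 409 while it is processing, otherwise 202. */
    static int responseStatus(Optional<BookStatus> current) {
        if (!current.isPresent()) {
            return 404; // Book not found
        }
        if (current.get() == BookStatus.PROCESSING) {
            return 409; // Book already in PROCESSING state
        }
        return 202;     // Re-embed accepted; book moves to PROCESSING
    }
}
```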
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 2. Get figures for a book (new)
|
||||||
|
|
||||||
|
Returns the list of extracted figures for a book, including their type, caption, and image URL.
|
||||||
|
Used by the frontend to display a figure gallery or inline figures in chat responses.
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/books/{id}/figures
|
||||||
|
```
|
||||||
|
|
||||||
|
**Path parameters**
|
||||||
|
|
||||||
|
| Parameter | Type | Description |
|
||||||
|
|-----------|------|-------------|
|
||||||
|
| `id` | UUID | Book ID |
|
||||||
|
|
||||||
|
**Response** `200 OK`
|
||||||
|
|
||||||
|
```json
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"figureId": "youmans-7ed-fig-12-4",
|
||||||
|
"label": "Fig. 12-4",
|
||||||
|
"caption": "Coronal cross-section of the cavernous sinus showing cranial nerve relationships",
|
||||||
|
"figureType": "ANATOMICAL_DIAGRAM",
|
||||||
|
"page": 184,
|
||||||
|
"imageUrl": "/api/v1/figures/550e8400-e29b-41d4-a716-446655440000/youmans-7ed-fig-12-4.png",
|
||||||
|
"sectionId": "youmans-7ed-ch12-s2-3",
|
||||||
|
"sectionTitle": "Cavernous Sinus"
|
||||||
|
}
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Error responses**
|
||||||
|
|
||||||
|
| Status | Condition |
|
||||||
|
|--------|-----------|
|
||||||
|
| 404 | Book not found |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 3. Serve figure image (new)
|
||||||
|
|
||||||
|
Serves the extracted figure image file. Mounted as a static resource from the file store.
|
||||||
|
|
||||||
|
```
|
||||||
|
GET /api/v1/figures/{bookId}/{filename}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Path parameters**
|
||||||
|
|
||||||
|
| Parameter | Type | Description |
|
||||||
|
|-----------|------|-------------|
|
||||||
|
| `bookId` | UUID | Book ID |
|
||||||
|
| `filename` | string | Image filename (e.g. `youmans-7ed-fig-12-4.png`) |
|
||||||
|
|
||||||
|
**Response** `200 OK` — binary PNG
|
||||||
|
**Content-Type**: `image/png`
|
||||||
|
|
||||||
|
**Error responses**
|
||||||
|
|
||||||
|
| Status | Condition |
|
||||||
|
|--------|-----------|
|
||||||
|
| 404 | Image file not found |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 4. Chat message response — extended source format (changed)
|
||||||
|
|
||||||
|
The existing `POST /api/v1/chat/sessions/{id}/messages` endpoint is unchanged in its request
|
||||||
|
format. The response `sources` field is extended to include figure references.
|
||||||
|
|
||||||
|
**Existing request** (unchanged):
|
||||||
|
|
||||||
|
```json
|
||||||
|
{ "content": "Describe the anatomy of the cavernous sinus" }
|
||||||
|
```
|
||||||
|
|
||||||
|
**Response** `200 OK` — extended `sources`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"id": "uuid",
|
||||||
|
"role": "ASSISTANT",
|
||||||
|
"content": "The cavernous sinus is ... [Fig. 12-4, p.184] ...",
|
||||||
|
"sources": [
|
||||||
|
{
|
||||||
|
"type": "TEXT",
|
||||||
|
"bookTitle": "Youmans and Winn Neurological Surgery, 7th Ed.",
|
||||||
|
"page": 184,
|
||||||
|
"chunkText": "The cavernous sinus contains ..."
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"type": "FIGURE",
|
||||||
|
"bookTitle": "Youmans and Winn Neurological Surgery, 7th Ed.",
|
||||||
|
"page": 184,
|
||||||
|
"figureId": "youmans-7ed-fig-12-4",
|
||||||
|
"label": "Fig. 12-4",
|
||||||
|
"caption": "Coronal cross-section of the cavernous sinus ...",
|
||||||
|
"figureType": "ANATOMICAL_DIAGRAM",
|
||||||
|
"imageUrl": "/api/v1/figures/550e8400-e29b-41d4-a716-446655440000/youmans-7ed-fig-12-4.png"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"createdAt": "2026-04-03T12:00:00Z"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Changed fields in `sources` array**:
|
||||||
|
|
||||||
|
| Field | Old | New |
|
||||||
|
|-------|-----|-----|
|
||||||
|
| `type` | absent | `"TEXT"` or `"FIGURE"` |
|
||||||
|
| `figureId` | absent | figure ID string (FIGURE type only) |
|
||||||
|
| `label` | absent | caption label (FIGURE type only) |
|
||||||
|
| `caption` | absent | full caption (FIGURE type only) |
|
||||||
|
| `figureType` | absent | enum name (FIGURE type only) |
|
||||||
|
| `imageUrl` | absent | image URL (FIGURE type only) |
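
On the backend, the two source shapes could be modelled as records behind a common interface. This is a sketch under assumptions: the type names (`Source`, `TextSource`, `FigureSource`) and the helper `figuresOnly` are illustrative and do not come from the contract; only the field names do.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical DTOs mirroring the `sources` entries in the chat response.
final class ChatSources {

    interface Source {
        String type();      // "TEXT" or "FIGURE"
        String bookTitle();
        int page();
    }

    record TextSource(String bookTitle, int page, String chunkText) implements Source {
        public String type() { return "TEXT"; }
    }

    record FigureSource(String bookTitle, int page, String figureId, String label,
                        String caption, String figureType, String imageUrl) implements Source {
        public String type() { return "FIGURE"; }
    }

    private ChatSources() {}

    /** Keeps only the figure entries, e.g. for rendering inline images. */
    static List<FigureSource> figuresOnly(List<Source> sources) {
        return sources.stream()
                .filter(s -> s instanceof FigureSource)
                .map(s -> (FigureSource) s)
                .collect(Collectors.toList());
    }
}
```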
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Unchanged Endpoints
|
||||||
|
|
||||||
|
All endpoints from feature 001 remain at their existing paths with no breaking changes:
|
||||||
|
|
||||||
|
- `POST /api/v1/books/upload`
|
||||||
|
- `GET /api/v1/books`
|
||||||
|
- `DELETE /api/v1/books/{id}`
|
||||||
|
- `GET /api/v1/topics`
|
||||||
|
- `GET /api/v1/topics/{id}/summary`
|
||||||
|
- `POST /api/v1/chat/sessions`
|
||||||
|
- `GET /api/v1/chat/sessions/{id}/messages`
|
||||||
|
- `DELETE /api/v1/chat/sessions/{id}`
|
||||||
@@ -0,0 +1,305 @@
|
|||||||
|
# Data Model: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Three storage tiers work in concert:
|
||||||
|
|
||||||
|
```
|
||||||
|
┌──────────────────────────────────────────────────────────────────┐
|
||||||
|
│ PDF Upload │
|
||||||
|
│ │ │
|
||||||
|
│ ▼ │
|
||||||
|
│ Parsing Pipeline │
|
||||||
|
│ │ │ │
|
||||||
|
│ ▼ ▼ │
|
||||||
|
│ Postgres (source of truth) pgvector (search index) │
|
||||||
|
│ - book - vector_store (text chunks) │
|
||||||
|
│ - chapter - vector_store (figure captions) │
|
||||||
|
│ - section (+ fullText) File Store (images) │
|
||||||
|
│ - figure (metadata) - /uploads/figures/{bookId}/*.png │
|
||||||
|
│ - chunk_figure_refs │
|
||||||
|
└──────────────────────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Postgres Schema
|
||||||
|
|
||||||
|
### Existing tables (unchanged)
|
||||||
|
|
||||||
|
- `book` — status, metadata, page count (V1)
|
||||||
|
- `chat_session`, `message` — conversation (V1)
|
||||||
|
- `vector_store` — managed by Spring AI pgvector starter (V2)
|
||||||
|
- `topic` — predefined topics (V3)
|
||||||
|
|
||||||
|
### New tables (Flyway V4)
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- V4: Document hierarchy
|
||||||
|
|
||||||
|
CREATE TABLE chapter (
|
||||||
|
id VARCHAR(200) PRIMARY KEY, -- "{bookId}-ch{N}"
|
||||||
|
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||||
|
number INT NOT NULL,
|
||||||
|
title VARCHAR(500),
|
||||||
|
page_start INT,
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE TABLE section (
|
||||||
|
id VARCHAR(200) PRIMARY KEY, -- "{bookId}-ch{N}-s{X}-{Y}"
|
||||||
|
chapter_id VARCHAR(200) NOT NULL REFERENCES chapter(id) ON DELETE CASCADE,
|
||||||
|
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||||
|
number VARCHAR(50), -- "2.3" or "12.2.3"
|
||||||
|
title VARCHAR(500),
|
||||||
|
page_start INT NOT NULL,
|
||||||
|
page_end INT NOT NULL,
|
||||||
|
full_text TEXT NOT NULL, -- NOT in vector store
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_section_book ON section(book_id);
|
||||||
|
CREATE INDEX idx_section_chapter ON section(chapter_id);
|
||||||
|
```
|
||||||
|
|
||||||
|
### New tables (Flyway V5)
|
||||||
|
|
||||||
|
```sql
|
||||||
|
-- V5: Figures and chunk→figure links
|
||||||
|
|
||||||
|
CREATE TABLE figure (
|
||||||
|
id VARCHAR(200) PRIMARY KEY, -- "{bookId}-fig-{label}"
|
||||||
|
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||||
|
section_id VARCHAR(200) REFERENCES section(id) ON DELETE SET NULL,
|
||||||
|
chapter_id VARCHAR(200) REFERENCES chapter(id) ON DELETE SET NULL,
|
||||||
|
label VARCHAR(100), -- "Fig. 12-4"
|
||||||
|
caption TEXT,
|
||||||
|
figure_type VARCHAR(50) NOT NULL, -- FigureType enum name
|
||||||
|
page INT NOT NULL,
|
||||||
|
image_path VARCHAR(1000) NOT NULL, -- relative path on disk
|
||||||
|
caption_embedding_id UUID, -- ID in vector_store
|
||||||
|
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE TABLE chunk_figure_ref (
|
||||||
|
chunk_id UUID NOT NULL, -- vector_store document ID
|
||||||
|
figure_id VARCHAR(200) NOT NULL REFERENCES figure(id) ON DELETE CASCADE,
|
||||||
|
mention_page INT,
|
||||||
|
PRIMARY KEY (chunk_id, figure_id)
|
||||||
|
);
|
||||||
|
|
||||||
|
CREATE INDEX idx_figure_book ON figure(book_id);
|
||||||
|
CREATE INDEX idx_cfr_chunk ON chunk_figure_ref(chunk_id);
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Java Domain Records
|
||||||
|
|
||||||
|
### Document hierarchy (new package `com.aiteacher.document`)
|
||||||
|
|
||||||
|
```java
|
||||||
|
// Root — in-memory only, not a JPA entity
|
||||||
|
public record BookNode(
|
||||||
|
String bookId,
|
||||||
|
String title,
|
||||||
|
String isbn,
|
||||||
|
String edition,
|
||||||
|
List<String> authors,
|
||||||
|
List<ChapterNode> chapters
|
||||||
|
) {}
|
||||||
|
|
||||||
|
// Chapter — maps to `chapter` table
|
||||||
|
public record ChapterNode(
|
||||||
|
String chapterId,
|
||||||
|
String bookId,
|
||||||
|
int number,
|
||||||
|
String title,
|
||||||
|
int pageStart,
|
||||||
|
List<SectionNode> sections
|
||||||
|
) {}
|
||||||
|
|
||||||
|
// Section — maps to `section` table; fullText stays in Postgres
|
||||||
|
public record SectionNode(
|
||||||
|
String sectionId,
|
||||||
|
String chapterId,
|
||||||
|
String bookId,
|
||||||
|
String number,
|
||||||
|
String title,
|
||||||
|
int pageStart,
|
||||||
|
int pageEnd,
|
||||||
|
String fullText,
|
||||||
|
List<TextChunkNode> chunks,
|
||||||
|
List<FigureNode> figures
|
||||||
|
) {}
|
||||||
|
|
||||||
|
// Text chunk — embedded into vector_store; references its parent section
|
||||||
|
public record TextChunkNode(
|
||||||
|
String chunkId, // UUID → becomes vector_store document ID
|
||||||
|
String sectionId,
|
||||||
|
String chapterId,
|
||||||
|
String bookId,
|
||||||
|
String text,
|
||||||
|
int chunkIndex,
|
||||||
|
int totalChunksInSection,
|
||||||
|
int pageStart,
|
||||||
|
int pageEnd,
|
||||||
|
Map<String, Object> metadata // flattened for Spring AI filtering
|
||||||
|
) {
|
||||||
|
// sectionTitle is passed in because the record itself does not hold it
|
||||||
|
public Map<String, Object> toMetadata(String sectionTitle) {
|
||||||
|
return Map.of(
|
||||||
|
"type", "TEXT",
|
||||||
|
"book_id", bookId,
|
||||||
|
"chapter_id", chapterId,
|
||||||
|
"section_id", sectionId,
|
||||||
|
"section_title", sectionTitle, // taken from the parent SectionNode
|
||||||
|
"page_start", pageStart,
|
||||||
|
"page_end", pageEnd,
|
||||||
|
"chunk_index", chunkIndex,
|
||||||
|
"total_chunks", totalChunksInSection
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Figure — maps to `figure` table; caption embedded into vector_store
|
||||||
|
public record FigureNode(
|
||||||
|
String figureId,
|
||||||
|
String sectionId,
|
||||||
|
String chapterId,
|
||||||
|
String bookId,
|
||||||
|
String label, // "Fig. 12-4"
|
||||||
|
String caption,
|
||||||
|
FigureType type,
|
||||||
|
int page,
|
||||||
|
String imagePath, // relative: "figures/{bookId}/{figureId}.png"
|
||||||
|
UUID captionEmbeddingId // ID in vector_store
|
||||||
|
) {}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Figure type enum
|
||||||
|
|
||||||
|
```java
|
||||||
|
public enum FigureType {
|
||||||
|
ANATOMICAL_DIAGRAM,
|
||||||
|
SURGICAL_PHOTOGRAPH,
|
||||||
|
MRI_CT_SCAN,
|
||||||
|
TABLE,
|
||||||
|
CHART,
|
||||||
|
INTRAOPERATIVE_IMAGE
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Classification heuristic (applied to caption + surrounding text):
|
||||||
|
|
||||||
|
| Keyword(s) | FigureType |
|
||||||
|
|-----------|-----------|
|
||||||
|
| `MRI`, `CT`, `magnetic`, `resonance`, `tomography` | `MRI_CT_SCAN` |
|
||||||
|
| `intraoperative`, `intra-op` | `INTRAOPERATIVE_IMAGE` |
|
||||||
|
| `table`, `Table` (at line start) | `TABLE` |
|
||||||
|
| `chart`, `graph`, `histogram` | `CHART` |
|
||||||
|
| `photograph`, `photo` | `SURGICAL_PHOTOGRAPH` |
|
||||||
|
| (default) | `ANATOMICAL_DIAGRAM` |
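
The heuristic table above could be applied as an ordered list of keyword rules, where the first match wins. The sketch below is an illustration only: `FigureTypeClassifier`, the `Rule` record, and the exact regular expressions (for example, matching `MRI` and `CT` case-sensitively as whole words) are assumptions rather than the project's implementation.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical classifier: rules are checked in table order, first match wins.
final class FigureTypeClassifier {

    enum FigureType {
        ANATOMICAL_DIAGRAM, SURGICAL_PHOTOGRAPH, MRI_CT_SCAN, TABLE, CHART, INTRAOPERATIVE_IMAGE
    }

    private record Rule(Pattern pattern, FigureType type) {}

    private static final List<Rule> RULES = List.of(
            // Acronyms are matched case-sensitively so that ordinary words do not trigger them.
            new Rule(Pattern.compile("\\b(MRI|CT)\\b|(?i:\\b(magnetic|resonance|tomography)\\b)"),
                    FigureType.MRI_CT_SCAN),
            new Rule(Pattern.compile("(?i)\\b(intraoperative|intra-op)\\b"),
                    FigureType.INTRAOPERATIVE_IMAGE),
            // "table" only counts at the start of a line.
            new Rule(Pattern.compile("(?im)^\\s*table\\b"), FigureType.TABLE),
            new Rule(Pattern.compile("(?i)\\b(chart|graph|histogram)\\b"), FigureType.CHART),
            new Rule(Pattern.compile("(?i)\\bphoto(graph)?s?\\b"), FigureType.SURGICAL_PHOTOGRAPH));

    private FigureTypeClassifier() {}

    /** Classifies a figure from its caption plus surrounding text. */
    static FigureType classify(String captionAndContext) {
        for (Rule rule : RULES) {
            if (rule.pattern().matcher(captionAndContext).find()) {
                return rule.type();
            }
        }
        return FigureType.ANATOMICAL_DIAGRAM; // default row of the table
    }
}
```

Because rules are evaluated in order, a caption such as "Intraoperative photograph ..." resolves to `INTRAOPERATIVE_IMAGE` rather than `SURGICAL_PHOTOGRAPH`; if a different precedence is wanted, the rule order would need to change.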
|
||||||
|
|
||||||
|
### Chunk–figure join record
|
||||||
|
|
||||||
|
```java
|
||||||
|
// Maps to `chunk_figure_ref` table
|
||||||
|
public record ChunkFigureRef(
|
||||||
|
UUID chunkId,
|
||||||
|
String figureId,
|
||||||
|
Integer mentionPage // nullable, matching chunk_figure_ref.mention_page
|
||||||
|
) {}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Vector Store Documents
|
||||||
|
|
||||||
|
All documents in `vector_store` carry a `metadata` JSON column with a `type` field for filtering.
|
||||||
|
|
||||||
|
### Text chunk document
|
||||||
|
|
||||||
|
| Field | Value |
|
||||||
|
|-------|-------|
|
||||||
|
| `content` | chunk text (400–600 tokens) |
|
||||||
|
| `metadata.type` | `"TEXT"` |
|
||||||
|
| `metadata.book_id` | book UUID |
|
||||||
|
| `metadata.book_title` | book title string |
|
||||||
|
| `metadata.chapter_id` | chapter ID string |
|
||||||
|
| `metadata.section_id` | section ID string |
|
||||||
|
| `metadata.section_title` | section title string |
|
||||||
|
| `metadata.page_start` | int |
|
||||||
|
| `metadata.page_end` | int |
|
||||||
|
| `metadata.chunk_index` | int (0-based) |
|
||||||
|
| `metadata.total_chunks` | int |
|
||||||
|
|
||||||
|
### Figure caption document
|
||||||
|
|
||||||
|
| Field | Value |
|
||||||
|
|-------|-------|
|
||||||
|
| `content` | vision-generated description + caption text |
|
||||||
|
| `metadata.type` | `"FIGURE"` |
|
||||||
|
| `metadata.book_id` | book UUID |
|
||||||
|
| `metadata.book_title` | book title string |
|
||||||
|
| `metadata.chapter_id` | chapter ID string |
|
||||||
|
| `metadata.section_id` | section ID string |
|
||||||
|
| `metadata.figure_id` | figure ID string |
|
||||||
|
| `metadata.figure_type` | enum name string |
|
||||||
|
| `metadata.image_path` | relative file path |
|
||||||
|
| `metadata.label` | caption label e.g. `"Fig. 12-4"` |
|
||||||
|
| `metadata.page` | int |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## File Store Layout
|
||||||
|
|
||||||
|
```
|
||||||
|
uploads/
|
||||||
|
└── figures/
|
||||||
|
└── {bookId}/
|
||||||
|
├── {figureId}.png
|
||||||
|
└── ...
|
||||||
|
```
|
||||||
|
|
||||||
|
- Base path configurable via `app.figure-storage.base-path` (default: `./uploads`)
|
||||||
|
- Files are served via `GET /api/v1/figures/{bookId}/{filename}` (static resource mapping)
|
||||||
|
- Gitignored; not version-controlled
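
Resolving a figure file from the layout above might look like the following sketch. The class `FigurePaths` and its method are hypothetical, and the check that the resolved path stays inside the book's directory is an added assumption about how `{filename}` from the URL should be handled, not something the contract states.

```java
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper: builds {basePath}/figures/{bookId}/{filename}.
final class FigurePaths {

    private FigurePaths() {}

    static Path resolve(Path basePath, UUID bookId, String filename) {
        Path bookDir = basePath.resolve("figures").resolve(bookId.toString()).normalize();
        Path file = bookDir.resolve(filename).normalize();
        // Assumption: reject names such as "../x.png" that would leave the book directory.
        if (!file.startsWith(bookDir)) {
            throw new IllegalArgumentException("filename escapes the book's figure directory");
        }
        return file;
    }
}
```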
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## State Transitions
|
||||||
|
|
||||||
|
Book processing extends the existing `BookStatus` state machine:
|
||||||
|
|
||||||
|
```
|
||||||
|
PENDING → PROCESSING → READY
|
||||||
|
↘ FAILED
|
||||||
|
```
|
||||||
|
|
||||||
|
During `PROCESSING`:
|
||||||
|
1. Parse PDF structure → extract chapters/sections → persist to Postgres
|
||||||
|
2. Split sections into text chunks → embed → write to vector_store
|
||||||
|
3. Extract images per page → filter by min size → save PNG → generate vision description → embed caption → write figure to Postgres + vector_store
|
||||||
|
4. Write `chunk_figure_ref` rows for all detected figure references in text
|
||||||
|
|
||||||
|
Failure at step 3 (individual page) → log + skip that page's images; continue.
|
||||||
|
Failure at any other step → set `BookStatus.FAILED`.
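
The per-page failure rule for step 3 can be sketched as a loop that catches errors page by page. This is a minimal illustration: `PageImagePipeline`, `Outcome`, and the `IntConsumer` callback standing in for the real extraction step are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical sketch of "log + skip that page's images; continue".
final class PageImagePipeline {

    record Outcome(List<Integer> processedPages, List<Integer> skippedPages) {}

    private PageImagePipeline() {}

    /** Runs the extraction callback for pages 1..pageCount, skipping pages that fail. */
    static Outcome extractAll(int pageCount, IntConsumer extractPage) {
        List<Integer> processed = new ArrayList<>();
        List<Integer> skipped = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            try {
                extractPage.accept(page);
                processed.add(page);
            } catch (RuntimeException e) {
                // A single bad page must not fail the whole book.
                System.err.println("Skipping images on page " + page + ": " + e.getMessage());
                skipped.add(page);
            }
        }
        return new Outcome(processed, skipped);
    }
}
```

Errors thrown outside this loop (steps 1, 2, and 4) would propagate to the caller, which is where the transition to `BookStatus.FAILED` would happen.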
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Retrieval Result Structure
|
||||||
|
|
||||||
|
```java
|
||||||
|
public record RetrievalResult(
|
||||||
|
List<SectionNode> parentSections, // expanded full-text context
|
||||||
|
List<Document> figureVectorHits, // semantic figure matches
|
||||||
|
List<FigureNode> linkedFigures // figures explicitly referenced in text chunks
|
||||||
|
) {}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `NeurosurgeryRetriever` service deduplicates figures across both lists before passing
|
||||||
|
the result to the LLM prompt builder.
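
Deduplication across the two figure lists could be keyed on the figure ID. The sketch below works on plain ID strings for brevity; `FigureDeduplicator` is a hypothetical name, and keeping the first occurrence (semantic hits before linked figures) is an assumption, since the checklist leaves the precedence unspecified.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: merges figure IDs from both sources, first occurrence wins.
final class FigureDeduplicator {

    private FigureDeduplicator() {}

    static List<String> mergeFigureIds(List<String> vectorHitIds, List<String> linkedIds) {
        // LinkedHashSet drops duplicates while preserving insertion order.
        Set<String> merged = new LinkedHashSet<>(vectorHitIds);
        merged.addAll(linkedIds);
        return List.copyOf(merged);
    }
}
```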
|
||||||
@@ -0,0 +1,105 @@
|
|||||||
|
# Implementation Plan: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03 | **Spec**: [spec.md](spec.md)
|
||||||
|
**Input**: Feature specification from `/specs/002-image-aware-embedding/spec.md`
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
Enhance the book embedding pipeline to extract images from every PDF page, generate descriptive
|
||||||
|
text for each image, and store all content (text chunks + figure captions) with rich, consistent
|
||||||
|
metadata in the vector store. A new document hierarchy (Book → Chapter → Section → TextChunk +
|
||||||
|
Figure) is introduced. Postgres holds the full-text sections and figure metadata; the vector
|
||||||
|
store holds chunk and figure caption embeddings; the local file store holds extracted image files.
|
||||||
|
At query time, both the text-chunk store and figure-caption store are searched in parallel and
|
||||||
|
results are merged before being sent to the LLM.
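
The parallel search described above might be arranged as two asynchronous lookups, one per `metadata.type`, each with its own topK. This sketch deliberately avoids the Spring AI API: the `Searcher` interface is a hypothetical stand-in for a filtered `VectorStore` query, and hits are represented as plain strings.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: two independent searches run concurrently, then combined.
final class DualSearch {

    /** Stand-in for a vector store query filtered by metadata type. */
    interface Searcher {
        List<String> search(String query, String type, int topK);
    }

    record Results(List<String> textHits, List<String> figureHits) {}

    private DualSearch() {}

    static Results search(String query, int textTopK, int figureTopK, Searcher searcher) {
        CompletableFuture<List<String>> text =
                CompletableFuture.supplyAsync(() -> searcher.search(query, "TEXT", textTopK));
        CompletableFuture<List<String>> figures =
                CompletableFuture.supplyAsync(() -> searcher.search(query, "FIGURE", figureTopK));
        // join() waits for both lookups; each modality keeps its own ranking.
        return new Results(text.join(), figures.join());
    }
}
```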
|
||||||
|
|
||||||
|
## Technical Context
|
||||||
|
|
||||||
|
**Language/Version**: Java 25 (backend), TypeScript / Node 20 (frontend)
|
||||||
|
**Primary Dependencies**: Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (embeddings + chat), PDFBox (via Spring AI PDF reader dependency)
|
||||||
|
**Storage**: PostgreSQL (JPA + Flyway), pgvector (Spring AI `VectorStore`), local file system (extracted images — `/uploads/figures/`)
|
||||||
|
**Testing**: Spring Boot Test, JUnit 5, Mockito
|
||||||
|
**Target Platform**: Linux server (Docker Compose)
|
||||||
|
**Project Type**: Web application — backend REST API + Vue 3 frontend
|
||||||
|
**Performance Goals**: Full book (up to 500 pages with images) processed in ≤ 30 minutes; query response unchanged from existing baseline
|
||||||
|
**Constraints**: No new deployable units; all changes within the existing `backend/` module; image storage on local disk (S3 migration is a future concern, behind an interface)
|
||||||
|
**Scale/Scope**: POC — <10 concurrent users; single shared book library
|
||||||
|
|
||||||
|
## Constitution Check
|
||||||
|
|
||||||
|
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
|
||||||
|
|
||||||
|
| Principle | Status | Notes |
|
||||||
|
|-----------|--------|-------|
|
||||||
|
| I — KISS | ⚠️ Justified violation — see Complexity Tracking | Hierarchical model + dual search adds complexity; justified by precision requirement |
|
||||||
|
| II — Easy to Change | ✅ | Figure storage wrapped behind `FigureStorageService` interface; can swap local disk for S3 |
|
||||||
|
| III — Web-First | ✅ | All new capabilities exposed via existing REST API; no new deployable units |
|
||||||
|
| IV — Docs as Architecture | ⚠️ Required | README Mermaid diagram MUST be updated in this PR to show new storage tiers |
|
||||||
|
|
||||||
|
## Project Structure
|
||||||
|
|
||||||
|
### Documentation (this feature)
|
||||||
|
|
||||||
|
```text
|
||||||
|
specs/002-image-aware-embedding/
|
||||||
|
├── plan.md # This file
|
||||||
|
├── research.md # Phase 0 output
|
||||||
|
├── data-model.md # Phase 1 output
|
||||||
|
├── quickstart.md # Phase 1 output
|
||||||
|
├── contracts/ # Phase 1 output
|
||||||
|
└── tasks.md # Phase 2 output (/speckit.tasks)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Source Code (repository root)
|
||||||
|
|
||||||
|
```text
|
||||||
|
backend/
|
||||||
|
├── src/main/java/com/aiteacher/
|
||||||
|
│ ├── book/
|
||||||
|
│ │ ├── Book.java (existing)
|
||||||
|
│ │ ├── BookController.java (existing)
|
||||||
|
│ │ ├── BookService.java (existing)
|
||||||
|
│ │ ├── BookRepository.java (existing)
|
||||||
|
│ │ ├── BookStatus.java (existing)
|
||||||
|
│ │ ├── BookEmbeddingService.java (existing — enhanced)
|
||||||
|
│ │ └── NoKnowledgeSourceException.java (existing)
|
||||||
|
│ ├── document/ (new package)
|
||||||
|
│ │ ├── BookNode.java
|
||||||
|
│ │ ├── ChapterNode.java
|
||||||
|
│ │ ├── SectionNode.java
|
||||||
|
│ │ ├── SectionRepository.java
|
||||||
|
│ │ ├── TextChunkNode.java
|
||||||
|
│ │ ├── FigureNode.java
|
||||||
|
│ │ ├── FigureRepository.java
|
||||||
|
│ │ ├── FigureType.java
|
||||||
|
│ │ ├── ChunkFigureRef.java
|
||||||
|
│ │ └── ChunkFigureRefRepository.java
|
||||||
|
│ ├── figure/ (new package)
|
||||||
|
│ │ ├── FigureStorageService.java (interface)
|
||||||
|
│ │ └── LocalFigureStorageService.java (implementation)
|
||||||
|
│ ├── retrieval/ (new package)
|
||||||
|
│ │ └── NeurosurgeryRetriever.java
|
||||||
|
│ ├── chat/
|
||||||
|
│ │ └── ChatService.java (updated — uses NeurosurgeryRetriever)
|
||||||
|
│ └── config/
|
||||||
|
│ └── FigureStorageConfig.java (new — configures upload dir)
|
||||||
|
└── src/main/resources/
|
||||||
|
└── db/migration/
|
||||||
|
├── V4__document_hierarchy.sql (new)
|
||||||
|
└── V5__figures_and_refs.sql (new)
|
||||||
|
|
||||||
|
uploads/
|
||||||
|
└── figures/ (runtime — extracted images; gitignored)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Structure Decision**: Option 2 (Web Application) confirmed. All backend changes stay within
|
||||||
|
`backend/`. Two new packages (`document/`, `retrieval/`) plus one interface package (`figure/`)
|
||||||
|
keep concerns separated without adding a deployable unit.
|
||||||
|
|
||||||
|
## Complexity Tracking
|
||||||
|
|
||||||
|
| Violation | Why Needed | Simpler Alternative Rejected Because |
|
||||||
|
|-----------|------------|-------------------------------------|
|
||||||
|
| Document hierarchy (BookNode → ChapterNode → SectionNode) | Parent-child retrieval: chunks reference their parent section so the LLM receives full section context, not just the matching fragment. This is the established solution for RAG precision. | Flat page-per-doc model (current) loses inter-sentence context; chunk-only retrieval produces incomplete answers for multi-paragraph clinical questions |
|
||||||
|
| Dual vector search (text chunks + figure captions) | Figure captions must be independently searchable — a query about "cavernous sinus anatomy" must surface the diagram even if no text chunk scores highly | Single vector store search would miss figures whose captions don't happen to be the highest-similarity hit; this is the core deliverable of the feature |
|
||||||
|
| Third storage tier (local file store for images) | Extracted images cannot live in Postgres (binary blobs degrade query performance) or the vector store (only vectors). A file-per-image approach is standard. | Storing images as base64 in Postgres JSONB would bloat the DB and complicate backup/restore; the `FigureStorageService` interface keeps the implementation swappable |
|
||||||
@@ -0,0 +1,86 @@
|
|||||||
|
# Quickstart: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
- Docker Compose running (PostgreSQL + pgvector)
|
||||||
|
- OpenAI API key set in `backend/src/main/resources/application.properties` or as env var `OPENAI_API_KEY`
|
||||||
|
- Java 25 + Maven on PATH
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## New Configuration
|
||||||
|
|
||||||
|
Add to `backend/src/main/resources/application.properties`:
|
||||||
|
|
||||||
|
```properties
|
||||||
|
# Figure storage
|
||||||
|
app.figure-storage.base-path=./uploads
|
||||||
|
app.figure-storage.min-image-size-px=100
|
||||||
|
```
|
||||||
|
|
||||||
|
The `uploads/figures/` directory is created automatically on first use. Add it to `.gitignore`.
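
How `app.figure-storage.min-image-size-px` is applied is not spelled out here. One plausible reading, shown purely as an assumption, is that an image is kept only when both its width and height reach the threshold; `ImageSizeFilter` is a hypothetical name.

```java
// Hypothetical filter: assumes the threshold applies to both dimensions.
final class ImageSizeFilter {

    private ImageSizeFilter() {}

    static boolean isLargeEnough(int widthPx, int heightPx, int minSizePx) {
        return widthPx >= minSizePx && heightPx >= minSizePx;
    }
}
```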
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Migration
|
||||||
|
|
||||||
|
Two new Flyway migrations run automatically on startup:
|
||||||
|
|
||||||
|
- `V4__document_hierarchy.sql` — adds `chapter` and `section` tables
|
||||||
|
- `V5__figures_and_refs.sql` — adds `figure` and `chunk_figure_ref` tables
|
||||||
|
|
||||||
|
No manual DB setup needed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Re-embedding Existing Books
|
||||||
|
|
||||||
|
Books embedded by feature 001 (text-only) remain functional for text queries. To add image
|
||||||
|
support, trigger a re-embed:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -X POST http://localhost:8080/api/v1/books/{bookId}/reembed \
|
||||||
|
-u admin:password
|
||||||
|
```
|
||||||
|
|
||||||
|
The book transitions to `PROCESSING`, old chunks and figures are deleted, and the new
|
||||||
|
image-aware pipeline runs. Status can be polled via `GET /api/v1/books`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Verifying Image Extraction
|
||||||
|
|
||||||
|
1. Upload a PDF with diagrams: `POST /api/v1/books/upload`
|
||||||
|
2. Wait for `status: "READY"` via `GET /api/v1/books`
|
||||||
|
3. List figures: `GET /api/v1/books/{id}/figures`; it should return at least one entry for each page that contains an image
|
||||||
|
4. Ask a diagram-specific question in chat — response `sources` should include a `type: "FIGURE"` entry
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Frontend: Rendering Inline Figures
|
||||||
|
|
||||||
|
The assistant message `content` field will contain figure references in the format
|
||||||
|
`[Fig. 12-4, p.184]`. The frontend should:
|
||||||
|
|
||||||
|
1. Parse `[Fig. X, p.N]` patterns in assistant message text
|
||||||
|
2. Look up the matching entry in `sources` where `type === "FIGURE"`
|
||||||
|
3. Render the figure inline using the `imageUrl` field
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Running Tests
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cd backend
|
||||||
|
mvn test
|
||||||
|
```
|
||||||
|
|
||||||
|
Key new test classes:
|
||||||
|
- `FigureExtractionServiceTest` — unit tests for image extraction and classification
|
||||||
|
- `NeurosurgeryRetrieverTest` — unit tests for dual-search merge and deduplication
|
||||||
|
- `BookEmbeddingServiceIntegrationTest` — integration test: upload PDF with known figures,
|
||||||
|
verify figures appear in `GET /api/v1/books/{id}/figures`
|
||||||
@@ -0,0 +1,188 @@
|
|||||||
|
# Research: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03
|
||||||
|
|
||||||
|
This document resolves all technical unknowns identified during planning. The primary source for
|
||||||
|
decisions is the detailed architecture provided directly by the project owner, supplemented by
|
||||||
|
Spring AI 2.0.0-M4 API specifics.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 1: Document Hierarchy Model
|
||||||
|
|
||||||
|
**Decision**: Adopt a four-level hierarchy — `BookNode` → `ChapterNode` → `SectionNode` →
|
||||||
|
`TextChunkNode` + `FigureNode`. The `SectionNode` is the pivotal unit: it holds the full section
|
||||||
|
text in Postgres and is used for parent-child context expansion at retrieval time.
|
||||||
|
|
||||||
|
**Rationale**: A flat page-per-document model (current implementation) loses structural context.
|
||||||
|
When a user asks a multi-faceted clinical question, the LLM needs the surrounding section text,
|
||||||
|
not just the matching fragment. Parent-child retrieval — where chunks point to their parent
|
||||||
|
section — is the established pattern for RAG precision. The hierarchy also makes figure-to-section
|
||||||
|
association explicit and queryable.
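
The parent-child expansion described above can be sketched in plain Java. This is only an illustration under stated assumptions: `ChunkHit`, `Section`, and `ParentSectionExpansion` are hypothetical stand-ins, not the project's entities, and the section lookup is modelled as an in-memory map rather than a repository call.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: expands retrieved chunks to their distinct parent sections. */
final class ParentSectionExpansion {

    /** Hypothetical stand-in for a retrieved chunk that points at its parent section. */
    record ChunkHit(String chunkId, String sectionId) {}

    /** Hypothetical stand-in for a stored section holding the full section text. */
    record Section(String id, String fullText) {}

    private ParentSectionExpansion() {}

    /**
     * Returns each parent section once, in the order its first chunk was retrieved.
     * Chunks whose section is unknown are skipped.
     */
    static List<Section> expand(List<ChunkHit> hits, Map<String, Section> sectionsById) {
        Set<String> seen = new LinkedHashSet<>();
        List<Section> parents = new ArrayList<>();
        for (ChunkHit hit : hits) {
            Section parent = sectionsById.get(hit.sectionId());
            if (parent != null && seen.add(parent.id())) {
                parents.add(parent);
            }
        }
        return parents;
    }
}
```

Several chunks from the same section collapse into one section entry, so the prompt carries each section text at most once.
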
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Keep flat page model, add metadata only → rejected: insufficient for precise citation and
|
||||||
|
context expansion
|
||||||
|
- Chapter-level retrieval (coarser than section) → rejected: too much irrelevant context sent
|
||||||
|
to LLM; cost and latency increase
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 2: Image Extraction Strategy
|
||||||
|
|
||||||
|
**Decision**: Use PDFBox (already on classpath via `spring-ai-pdf-document-reader`) to extract
|
||||||
|
images per page. Each image is tagged with `page`, `figure_id` (derived from caption, e.g.
|
||||||
|
"Fig. 12-4"), and the parent `sectionId`. Images are saved to local disk under
|
||||||
|
`./uploads/figures/{bookId}/`.
|
||||||
|
|
||||||
|
**Rationale**: PDFBox is already present (Spring AI bundles it). No new dependency needed.
|
||||||
|
Per-page extraction ensures every image is captured regardless of PDF structure.
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- iText / iText7 → additional commercial dependency; overkill for extraction
|
||||||
|
- Screenshot each page as PNG, then OCR → far slower; loses vector quality
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 3: Figure Content Representation
|
||||||
|
|
||||||
|
**Decision**: Generate a textual description of each extracted image using the OpenAI vision
|
||||||
|
model (GPT-4o). This description becomes the `content` field of the figure's vector store
|
||||||
|
document. The figure caption (parsed from the surrounding text) is also included to maximise
|
||||||
|
retrieval signal.
|
||||||
|
|
||||||
|
**Rationale**: Caption-only embedding would miss figures with no caption or with sparse labels.
|
||||||
|
Vision-generated descriptions produce richer semantic content (anatomy terms, structural
|
||||||
|
relationships) that matches clinical queries. The OpenAI client already in use supports image
|
||||||
|
inputs; no additional dependency is required.
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Caption-only embedding → insufficient when captions are absent or terse (common in textbooks)
|
||||||
|
- Local vision model (LLaVA) → requires self-hosting; out of scope for POC
|
||||||
|
- OCR only → extracts text visible in image but misses non-text visual content (diagrams, MRI)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 4: Dual Vector Search
|
||||||
|
|
||||||
|
**Decision**: At query time, run two parallel similarity searches:
|
||||||
|
1. Text chunk search (filtered by `type = "TEXT"` and `book_id`)
|
||||||
|
2. Figure search over caption and generated description (filtered by `type = "FIGURE"` and `book_id`)
|
||||||
|
|
||||||
|
Results are merged and deduplicated. The LLM prompt receives the expanded parent section text
|
||||||
|
plus a structured figure reference list.
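
One possible reading of "merged and deduplicated" is sketched below in plain Java. The `Hit` record and `SearchResultMerge` class are hypothetical and no Spring AI types are used. The sketch also assumes that duplicates are identified by document id and that the higher-scoring copy should win; the source does not specify either rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: merges two result lists and keeps one entry per document id. */
final class SearchResultMerge {

    /** Hypothetical minimal view of a search hit: document id plus similarity score. */
    record Hit(String id, double score) {}

    private SearchResultMerge() {}

    /**
     * Merges both lists in first-seen order. When the same id appears twice,
     * the hit with the higher score is kept (an assumption of this sketch).
     */
    static List<Hit> mergeAndDeduplicate(List<Hit> first, List<Hit> second) {
        Map<String, Hit> byId = new LinkedHashMap<>();
        for (List<Hit> source : List.of(first, second)) {
            for (Hit hit : source) {
                byId.merge(hit.id(), hit, (a, b) -> a.score() >= b.score() ? a : b);
            }
        }
        return new ArrayList<>(byId.values());
    }
}
```

Because each search runs with its own `topK`, the merge step only has to reconcile overlaps; it does not need to re-rank text against figures.
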
|
||||||
|
|
||||||
|
**Rationale**: A single search would rank text and figures against each other; figures with
|
||||||
|
terse captions would systematically lose to text chunks. Separate searches with independent
|
||||||
|
`topK` allow tuning each modality independently.
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Single search, filter by relevance score → figure captions score lower than text; figures
|
||||||
|
are systematically under-retrieved
|
||||||
|
- Post-process text results to look up linked figures only → misses figures that are relevant
|
||||||
|
to the query but not explicitly referenced in the retrieved text chunks
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 5: Chunk-to-Figure Linking
|
||||||
|
|
||||||
|
**Decision**: During text parsing, whenever a pattern matching `Fig\.\s*\d+[\-\.]\d+` or
|
||||||
|
`Figure\s+\d+[\-\.]\d+` is found in a chunk, insert a row into the `chunk_figure_ref` table
|
||||||
|
linking `chunkId` → `figureId`. At retrieval time, after text chunks are retrieved, their
|
||||||
|
associated figures are fetched from this table and added to the LLM prompt.
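
The pattern matching in this decision might look roughly like the following. It is a sketch: `FigureRefScanner` is a hypothetical name, the two patterns are folded into one regular expression, whitespace after `Fig.` is treated as optional, and labels are normalised to the `12-4` form. These are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: finds figure mentions such as "Fig. 12-4" or "Figure 12.4". */
final class FigureRefScanner {

    // Group 1 is the chapter number, group 2 the figure number within the chapter.
    private static final Pattern FIGURE_REF =
            Pattern.compile("(?:Fig\\.\\s*|Figure\\s+)(\\d+)[\\-.](\\d+)");

    private FigureRefScanner() {}

    /** Returns normalised labels ("12-4") in first-seen order, without duplicates. */
    static Set<String> findFigureLabels(String chunkText) {
        Set<String> labels = new LinkedHashSet<>();
        Matcher matcher = FIGURE_REF.matcher(chunkText);
        while (matcher.find()) {
            labels.add(matcher.group(1) + "-" + matcher.group(2));
        }
        return labels;
    }
}
```

Each returned label would then be matched against the stored figure labels for the book before a link row is written.
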
|
||||||
|
|
||||||
|
**Rationale**: Explicit linking ensures that when a text chunk is retrieved, its referenced
|
||||||
|
figures are always surfaced — even if the figure's caption did not score highly in the vector
|
||||||
|
search. This is the higher-recall path; dual search (Decision 4) is the higher-precision path.
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Rely entirely on dual vector search → may miss figures referenced in retrieved text but
|
||||||
|
scoring below the topK threshold in the figure search
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 6: Image Storage
|
||||||
|
|
||||||
|
**Decision**: Extracted images are saved as PNG files to a local directory
|
||||||
|
(`${app.figure-storage.base-path}/figures/{bookId}/`; the base path defaults to `./uploads`). The path is
|
||||||
|
stored in `figure.image_path` in Postgres. A `FigureStorageService` interface wraps all disk
|
||||||
|
I/O so the implementation can be swapped to S3 or another object store without changing
|
||||||
|
callers.
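
A minimal sketch of the interface boundary follows, using the method signatures listed in task T006. The local implementation is an assumption for illustration only: the file-naming rule, the error handling, and the use of `ImageIO` are not taken from the project.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;
import javax.imageio.ImageIO;

/** Storage boundary; callers never touch the file system directly. */
interface FigureStorageService {
    Path save(UUID bookId, String figureId, BufferedImage image);
    Path resolve(UUID bookId, String filename);
    void delete(UUID bookId);
}

/** Illustrative local-disk implementation writing to {basePath}/figures/{bookId}/. */
final class LocalFigureStorageService implements FigureStorageService {

    private final Path basePath;

    LocalFigureStorageService(Path basePath) {
        this.basePath = basePath;
    }

    @Override
    public Path save(UUID bookId, String figureId, BufferedImage image) {
        // Hypothetical naming rule: anything except letters, digits, '-' and '_' becomes '_'.
        String filename = figureId.replaceAll("[^A-Za-z0-9_-]+", "_") + ".png";
        Path target = resolve(bookId, filename);
        try {
            Files.createDirectories(target.getParent());
            if (!ImageIO.write(image, "png", target.toFile())) {
                throw new IOException("No PNG writer available");
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public Path resolve(UUID bookId, String filename) {
        return basePath.resolve("figures").resolve(bookId.toString()).resolve(filename);
    }

    @Override
    public void delete(UUID bookId) {
        Path bookDir = basePath.resolve("figures").resolve(bookId.toString());
        if (!Files.exists(bookDir)) {
            return;
        }
        // Delete children before parents by walking the tree in reverse order.
        try (Stream<Path> paths = Files.walk(bookDir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An S3-backed implementation would replace only the second class; code that depends on `FigureStorageService` would stay unchanged.
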
|
||||||
|
|
||||||
|
**Rationale**: Local disk is the simplest viable option for a POC with <10 users. The interface
|
||||||
|
boundary satisfies Constitution Principle II (Easy to Change).
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- S3 from day 1 → operational overhead not justified at POC scale
|
||||||
|
- Base64 in Postgres JSONB → bloats DB; complicates backup; query performance degrades
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 7: Figure Type Classification
|
||||||
|
|
||||||
|
**Decision**: Use the enum `FigureType { ANATOMICAL_DIAGRAM, SURGICAL_PHOTOGRAPH, MRI_CT_SCAN,
|
||||||
|
TABLE, CHART, INTRAOPERATIVE_IMAGE }`. Classification is derived from:
|
||||||
|
1. Caption keywords ("MRI", "CT", "Fig.", "Table") — heuristic, no model needed
|
||||||
|
2. Fall back to `ANATOMICAL_DIAGRAM` if unclassifiable
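
The heuristic could be sketched as below. The keyword rules here are illustrative assumptions, not the project's actual keyword table, and `FigureTypeClassifier` is a hypothetical name. Only the enum values and the `ANATOMICAL_DIAGRAM` fallback come from the decision itself.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Illustrative only: caption-keyword classification with a documented fallback. */
final class FigureTypeClassifier {

    enum FigureType {
        ANATOMICAL_DIAGRAM, SURGICAL_PHOTOGRAPH, MRI_CT_SCAN, TABLE, CHART, INTRAOPERATIVE_IMAGE
    }

    private FigureTypeClassifier() {}

    /** Rules are checked in order; the first match wins. */
    static FigureType classify(String caption) {
        if (caption == null || caption.isBlank()) {
            return FigureType.ANATOMICAL_DIAGRAM; // fallback when nothing can be inferred
        }
        String text = caption.toLowerCase(Locale.ROOT);
        if (containsWord(text, "mri") || containsWord(text, "ct")) {
            return FigureType.MRI_CT_SCAN;
        }
        if (text.startsWith("table")) {
            return FigureType.TABLE;
        }
        if (containsWord(text, "chart") || containsWord(text, "graph")) {
            return FigureType.CHART;
        }
        if (text.contains("intraoperative")) {
            return FigureType.INTRAOPERATIVE_IMAGE;
        }
        if (containsWord(text, "photograph")) {
            return FigureType.SURGICAL_PHOTOGRAPH;
        }
        return FigureType.ANATOMICAL_DIAGRAM;
    }

    // Whole-word match so that "ct" does not fire inside words such as "section".
    private static boolean containsWord(String text, String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text).find();
    }
}
```

Whole-word matching matters for short keywords: a plain substring test for `ct` would misclassify any caption containing "section" or "structure".
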
|
||||||
|
|
||||||
|
**Rationale**: Allows the frontend to render different icon/label per type (e.g., "MRI" badge).
|
||||||
|
Heuristic classification avoids a separate model call per image at extraction time.
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- Vision model classification → accurate but adds latency and cost per figure; deferrable
|
||||||
|
- Single `FIGURE` type → loses citation granularity required by spec FR-004
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 8: Metadata Schema for Vector Store Documents
|
||||||
|
|
||||||
|
**Decision**: All vector store documents carry a flat `Map<String, Object>` metadata map for Spring
|
||||||
|
AI filtering. Schema:
|
||||||
|
|
||||||
|
| Field | Text Chunk | Figure Chunk |
|
||||||
|
|-------|-----------|-------------|
|
||||||
|
| `type` | `"TEXT"` | `"FIGURE"` |
|
||||||
|
| `book_id` | ✓ | ✓ |
|
||||||
|
| `book_title` | ✓ | ✓ |
|
||||||
|
| `chapter_id` | ✓ | ✓ |
|
||||||
|
| `section_id` | ✓ | ✓ |
|
||||||
|
| `section_title` | ✓ | ✓ |
|
||||||
|
| `page_start` | ✓ | — |
|
||||||
|
| `page_end` | ✓ | — |
|
||||||
|
| `chunk_index` | ✓ | — |
|
||||||
|
| `total_chunks` | ✓ | — |
|
||||||
|
| `figure_id` | — | ✓ |
|
||||||
|
| `figure_type` | — | ✓ |
|
||||||
|
| `image_path` | — | ✓ |
|
||||||
|
| `label` | — | ✓ |
|
||||||
|
| `page` | — | ✓ |
|
||||||
|
|
||||||
|
**Rationale**: Flat map is required by Spring AI `FilterExpressionBuilder`. Separation by `type`
|
||||||
|
allows independent filtering in dual search.
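
As an illustration of the schema table, the two metadata maps could be assembled as follows. `ChunkMetadata` and its method names are hypothetical, and the value types (strings for ids, ints for pages and indices) are assumptions; only the key names come from the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: builds the flat metadata maps for text and figure documents. */
final class ChunkMetadata {

    private ChunkMetadata() {}

    // Keys shared by both document types in the schema table.
    private static Map<String, Object> common(String type, String bookId, String bookTitle,
            String chapterId, String sectionId, String sectionTitle) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("type", type);
        metadata.put("book_id", bookId);
        metadata.put("book_title", bookTitle);
        metadata.put("chapter_id", chapterId);
        metadata.put("section_id", sectionId);
        metadata.put("section_title", sectionTitle);
        return metadata;
    }

    static Map<String, Object> forTextChunk(String bookId, String bookTitle, String chapterId,
            String sectionId, String sectionTitle, int pageStart, int pageEnd,
            int chunkIndex, int totalChunks) {
        Map<String, Object> metadata =
                common("TEXT", bookId, bookTitle, chapterId, sectionId, sectionTitle);
        metadata.put("page_start", pageStart);
        metadata.put("page_end", pageEnd);
        metadata.put("chunk_index", chunkIndex);
        metadata.put("total_chunks", totalChunks);
        return metadata;
    }

    static Map<String, Object> forFigure(String bookId, String bookTitle, String chapterId,
            String sectionId, String sectionTitle, String figureId, String figureType,
            String imagePath, String label, int page) {
        Map<String, Object> metadata =
                common("FIGURE", bookId, bookTitle, chapterId, sectionId, sectionTitle);
        metadata.put("figure_id", figureId);
        metadata.put("figure_type", figureType);
        metadata.put("image_path", imagePath);
        metadata.put("label", label);
        metadata.put("page", page);
        return metadata;
    }
}
```

Keeping every value a scalar is what makes the map flat: no nested maps or lists appear under any key.
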
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 9: Re-embedding Existing Books
|
||||||
|
|
||||||
|
**Decision**: Books already processed under feature 001 (text-only) are NOT automatically
|
||||||
|
re-embedded. An explicit re-embed action is exposed via `POST /api/v1/books/{id}/reembed`
|
||||||
|
(admin-triggered). The existing chunks remain valid for text queries until re-embedding completes.
|
||||||
|
|
||||||
|
**Rationale**: Automatic re-embedding on deploy would block the system and risk data loss if
|
||||||
|
the process fails mid-way. An explicit, idempotent trigger is safer and more observable.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Decision 10: Minimum Image Size Threshold
|
||||||
|
|
||||||
|
**Decision**: Images smaller than 100×100 pixels are discarded and no chunk is created. This
|
||||||
|
threshold filters out decorative elements (bullets, dividers, publisher logos) without a
|
||||||
|
classification model.
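
The size check itself is small. A sketch follows, assuming an image is kept only when both its width and its height reach the configured minimum; the phrase "smaller than 100×100" could be read other ways. `ImageSizeFilter` is a hypothetical name.

```java
import java.awt.image.BufferedImage;

/** Illustrative only: rejects images with any side below the configured minimum. */
final class ImageSizeFilter {

    private final int minSizePx;

    ImageSizeFilter(int minSizePx) {
        if (minSizePx < 0) {
            throw new IllegalArgumentException("minSizePx must not be negative");
        }
        this.minSizePx = minSizePx;
    }

    /** True when both dimensions are at least the minimum (the reading assumed here). */
    boolean isLargeEnough(int widthPx, int heightPx) {
        return widthPx >= minSizePx && heightPx >= minSizePx;
    }

    boolean isLargeEnough(BufferedImage image) {
        return isLargeEnough(image.getWidth(), image.getHeight());
    }
}
```

Under this reading a long, thin divider (for example 800×4 px) is discarded even though its pixel count is large.
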
|
||||||
|
|
||||||
|
**Rationale**: Neurosurgery textbook diagrams and MRI scans are, in practice, larger than 100×100 px.
|
||||||
|
The threshold is configurable via `app.figure-storage.min-image-size-px` in
|
||||||
|
`application.properties`.
|
||||||
|
|
||||||
|
**Alternatives considered**:
|
||||||
|
- No threshold → decorative icons pollute the figure index
|
||||||
|
- ML-based classification → accurate but adds model dependency; not needed at POC scale
|
||||||
@@ -0,0 +1,176 @@
|
|||||||
|
# Feature Specification: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Feature Branch**: `002-image-aware-embedding`
|
||||||
|
**Created**: 2026-04-03
|
||||||
|
**Status**: Draft
|
||||||
|
**Input**: User description: "I want to enhance the embedding process. I want also parse image from each pages if any and add proper metadata so that it can match the retrieved chunk/vector that match what user are querying."
|
||||||
|
|
||||||
|
## User Scenarios & Testing *(mandatory)*
|
||||||
|
|
||||||
|
### User Story 1 - Image Content Surfaced in Query Results (Priority: P1)
|
||||||
|
|
||||||
|
A neurosurgeon asks a question in the chat (e.g., "Show me the anatomy of the Circle of Willis")
|
||||||
|
that is best answered by a diagram or figure in an uploaded book. The system retrieves the image
|
||||||
|
content — its description and surrounding context — and uses it to construct a grounded answer,
|
||||||
|
citing the page and book where the image appeared.
|
||||||
|
|
||||||
|
**Why this priority**: This is the direct, user-visible payoff of the feature. Without it, the
|
||||||
|
enhancement has no observable benefit. All other stories support this outcome.
|
||||||
|
|
||||||
|
**Independent Test**: Upload a book containing a labelled anatomical diagram. Ask a query whose
|
||||||
|
answer is conveyed by that diagram (not in the surrounding text). Confirm the system returns an
|
||||||
|
answer that references the diagram's content and cites the correct book and page.
|
||||||
|
|
||||||
|
**Acceptance Scenarios**:
|
||||||
|
|
||||||
|
1. **Given** a book with an anatomical diagram on page 42, **When** a user asks a question whose
|
||||||
|
answer is only depicted in that diagram, **Then** the system returns a response that draws on
|
||||||
|
the diagram's content and cites "Page 42, [Book Title]".
|
||||||
|
2. **Given** a page with both text and an image, **When** the system retrieves that page's content,
|
||||||
|
**Then** the image-derived content and the surrounding text are each independently retrievable
|
||||||
|
and independently citable.
|
||||||
|
3. **Given** a query that has no relevant image in any uploaded book, **When** the system searches,
|
||||||
|
**Then** it does not fabricate image-derived content and falls back to text-only results (or
|
||||||
|
states no relevant content was found).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### User Story 2 - All Pages Scanned for Images During Embedding (Priority: P1)
|
||||||
|
|
||||||
|
When a book is uploaded and processed, every page is inspected for images. Any image found is
|
||||||
|
extracted and represented as a searchable content chunk enriched with metadata (page number,
|
||||||
|
book title, position on page, caption if present). Pages without images are processed as
|
||||||
|
text-only chunks, unchanged from the existing behaviour.
|
||||||
|
|
||||||
|
**Why this priority**: This is the prerequisite for User Story 1. Without systematic per-page
|
||||||
|
image detection, image content cannot be retrieved.
|
||||||
|
|
||||||
|
**Independent Test**: Upload a book whose pages include a mix of text-only and image-containing
|
||||||
|
pages. After processing completes, verify that chunks exist for each image page and that each
|
||||||
|
image chunk carries the correct metadata (page number, source book, caption).
|
||||||
|
|
||||||
|
**Acceptance Scenarios**:
|
||||||
|
|
||||||
|
1. **Given** a book being processed, **When** the embedding pipeline runs, **Then** every page
|
||||||
|
is evaluated for images and each detected image generates at least one content chunk.
|
||||||
|
2. **Given** an image with a caption or label, **When** the chunk is created, **Then** the
|
||||||
|
caption or label text is included in the chunk's content and metadata.
|
||||||
|
3. **Given** a page with multiple images, **When** processing completes, **Then** each image is
|
||||||
|
represented as a separate chunk with its own metadata, not merged into a single chunk.
|
||||||
|
4. **Given** a page with no images, **When** processing completes, **Then** no image chunk is
|
||||||
|
created for that page and text processing is unaffected.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### User Story 3 - Rich Metadata Enables Precise Source Attribution (Priority: P2)
|
||||||
|
|
||||||
|
When the system returns a result based on image content, the user can see exactly where that
|
||||||
|
image appeared: which book, which page, and what type of content (diagram, table, photograph,
|
||||||
|
etc.). This gives the user confidence in the source and lets them locate the original image
|
||||||
|
in their physical or digital copy of the book.
|
||||||
|
|
||||||
|
**Why this priority**: Metadata quality directly impacts user trust. Neurosurgeons require
|
||||||
|
traceable, citable evidence. Richer metadata also improves retrieval accuracy by giving the
|
||||||
|
search engine more signals to match against a query.
|
||||||
|
|
||||||
|
**Independent Test**: Retrieve a result sourced from an image chunk. Inspect the displayed
|
||||||
|
citation and verify it includes: book title, page number, content type (e.g., "diagram"),
|
||||||
|
and caption (if present in the original).
|
||||||
|
|
||||||
|
**Acceptance Scenarios**:
|
||||||
|
|
||||||
|
1. **Given** a retrieved image chunk, **When** the system displays the source citation,
|
||||||
|
**Then** the citation includes at minimum: book title, page number, and a content-type
|
||||||
|
label (e.g., diagram, table, figure).
|
||||||
|
2. **Given** an image chunk with a detected caption, **When** the citation is displayed,
|
||||||
|
**Then** the caption text is shown alongside the other metadata fields.
|
||||||
|
3. **Given** a topic summary that draws on both text and image chunks, **When** the user
|
||||||
|
inspects citations, **Then** image-sourced and text-sourced claims are distinguishable
|
||||||
|
from each other.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Edge Cases
|
||||||
|
|
||||||
|
- What happens when an image is too small to contain meaningful content (e.g., a decorative
|
||||||
|
bullet icon or a publisher logo)?
|
||||||
|
- How does the system handle a page that is entirely an image (scanned page with no digital text)?
|
||||||
|
- What if an image spans multiple pages (e.g., a fold-out diagram)?
|
||||||
|
- How does the system behave when an image has no caption and its surrounding text provides
|
||||||
|
no useful context?
|
||||||
|
- What happens if image processing fails for a specific page — does it abort the whole book
|
||||||
|
or continue with the remaining pages?
|
||||||
|
|
||||||
|
## Requirements *(mandatory)*
|
||||||
|
|
||||||
|
### Functional Requirements
|
||||||
|
|
||||||
|
- **FR-001**: System MUST inspect every page of an uploaded book for the presence of images
|
||||||
|
during the embedding process.
|
||||||
|
- **FR-002**: System MUST extract each detected image and create a dedicated, independently
|
||||||
|
searchable content chunk for it.
|
||||||
|
- **FR-003**: System MUST generate a descriptive textual representation of each extracted
|
||||||
|
image so its content is semantically searchable by the retrieval system.
|
||||||
|
- **FR-004**: System MUST associate the following metadata with every image chunk: book title,
|
||||||
|
page number, content type (e.g., diagram, table, figure, photograph), and caption text
|
||||||
|
(where present).
|
||||||
|
- **FR-005**: System MUST include the same base metadata (book title, page number) on text
|
||||||
|
chunks so that all retrieved content — image or text — carries consistent, comparable
|
||||||
|
source attribution.
|
||||||
|
- **FR-006**: System MUST treat image chunks as first-class retrievable units: they must be
|
||||||
|
ranked and returned alongside text chunks when they are relevant to a user query.
|
||||||
|
- **FR-007**: System MUST skip images that fall below a minimum meaningful-content threshold
|
||||||
|
(e.g., decorative icons, page separators) and MUST NOT create chunks for them.
|
||||||
|
- **FR-008**: If image processing fails for a specific page, the system MUST log the failure,
|
||||||
|
skip that page's image, and continue processing the remaining pages and text content of
|
||||||
|
the book.
|
||||||
|
- **FR-009**: System MUST display image-sourced content citations distinctly from text-sourced
|
||||||
|
citations so users can identify when a result originates from a visual element.
|
||||||
|
- **FR-010**: Processing a book that contains images MUST NOT degrade the accuracy or
|
||||||
|
completeness of the existing text-only embedding for that book.
|
||||||
|
|
||||||
|
### Key Entities
|
||||||
|
|
||||||
|
- **Image Chunk**: A searchable content unit derived from a page image. Attributes: generated
|
||||||
|
description, source book title, page number, content type, caption (optional), embedding vector.
|
||||||
|
- **Text Chunk**: Existing unit; extended to carry explicit metadata: source book title,
|
||||||
|
page number, section heading (if detectable), content type ("text").
|
||||||
|
- **Chunk Metadata**: Structured attributes attached to every chunk regardless of type,
|
||||||
|
enabling consistent filtering and citation. Mandatory fields: book title, page number,
|
||||||
|
content type. Optional fields: caption, section heading.
|
||||||
|
|
||||||
|
## Success Criteria *(mandatory)*
|
||||||
|
|
||||||
|
### Measurable Outcomes
|
||||||
|
|
||||||
|
- **SC-001**: At least 90% of pages containing images in a test book result in a retrievable
|
||||||
|
image chunk after processing completes.
|
||||||
|
- **SC-002**: A controlled set of 10 queries whose answers are conveyed by diagrams in an
|
||||||
|
uploaded book returns at least 7 correct image-sourced answers (70% recall on image queries).
|
||||||
|
- **SC-003**: Embedding processing time for a book with images increases by no more than 3×
|
||||||
|
compared to processing the same book as text-only, for books up to 500 pages.
|
||||||
|
- **SC-004**: Every retrieved result — text or image — includes a citation that identifies
|
||||||
|
at minimum the source book title and page number, with 100% coverage across a test result set.
|
||||||
|
- **SC-005**: In a user evaluation with 5 representative queries that previously returned
|
||||||
|
no useful results (because the answer was only in a diagram), at least 4 now return a
|
||||||
|
useful, grounded answer.
|
||||||
|
|
||||||
|
## Assumptions
|
||||||
|
|
||||||
|
- Books are still uploaded exclusively as PDFs; image parsing applies to PDF pages only.
|
||||||
|
- The platform already has a working text-only embedding pipeline (from feature 001); this
|
||||||
|
feature enhances it without replacing or rewriting the text processing logic.
|
||||||
|
- Images worth processing are those that occupy a meaningful portion of the page; small
|
||||||
|
decorative or structural images (logos, dividers, icons) are excluded based on a size
|
||||||
|
threshold determined during implementation.
|
||||||
|
- The descriptive representation of an image (FR-003) is generated at embedding time, not
|
||||||
|
at query time; query latency is not affected by image interpretation.
|
||||||
|
- The shared global book library model from feature 001 is retained; image chunks from a
|
||||||
|
processed book are available to all users immediately upon completion.
|
||||||
|
- Scanned pages (fully rasterised pages with no digital text layer) are treated as a single
|
||||||
|
full-page image; the system attempts to extract content from them but does not guarantee
|
||||||
|
the same fidelity as pages with digital text.
|
||||||
|
- Per-chunk metadata is stored alongside the vector so it can be used for both retrieval
|
||||||
|
filtering and source citation display without a separate lookup.
|
||||||
|
- Books already processed under feature 001 (text-only) are not automatically re-processed;
|
||||||
|
re-embedding must be triggered explicitly by the user or an administrator.
|
||||||
@@ -0,0 +1,168 @@
|
|||||||
|
# Tasks: Enhanced Embedding with Image Parsing and Metadata
|
||||||
|
|
||||||
|
**Input**: Design documents from `/specs/002-image-aware-embedding/`
|
||||||
|
**Prerequisites**: plan.md ✓ | spec.md ✓ | research.md ✓ | data-model.md ✓ | contracts/ ✓
|
||||||
|
|
||||||
|
**Organization**: Tasks grouped by user story to enable independent implementation and testing.
|
||||||
|
|
||||||
|
## Format: `[ID] [P?] [Story] Description`
|
||||||
|
|
||||||
|
- **[P]**: Can run in parallel (different files, no shared dependencies)
|
||||||
|
- **[US1/US2/US3]**: Which user story this task belongs to
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 1: Setup (Shared Infrastructure)
|
||||||
|
|
||||||
|
**Purpose**: Database migrations and configuration that establish the foundation for all new code
|
||||||
|
|
||||||
|
- [X] T001 Create Flyway migration `V4__document_hierarchy.sql` — add `chapter` and `section` tables per data-model.md §Postgres Schema in `backend/src/main/resources/db/migration/V4__document_hierarchy.sql`
|
||||||
|
- [X] T002 Create Flyway migration `V5__figures_and_refs.sql` — add `figure` and `chunk_figure_ref` tables per data-model.md §Postgres Schema in `backend/src/main/resources/db/migration/V5__figures_and_refs.sql`
|
||||||
|
- [X] T003 Add figure-storage configuration keys to `backend/src/main/resources/application.properties`: `app.figure-storage.base-path=./uploads` and `app.figure-storage.min-image-size-px=100`
|
||||||
|
- [X] T004 Add `uploads/` directory to `.gitignore` at repo root; create `uploads/figures/.gitkeep` to preserve directory structure
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 2: Foundational (Blocking Prerequisites)
|
||||||
|
|
||||||
|
**Purpose**: Core types and infrastructure that ALL user stories depend on — nothing in Phase 3+ can start until this phase is complete
|
||||||
|
|
||||||
|
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
|
||||||
|
|
||||||
|
- [X] T005 [P] Create `FigureType` enum in `backend/src/main/java/com/aiteacher/document/FigureType.java` — values: `ANATOMICAL_DIAGRAM`, `SURGICAL_PHOTOGRAPH`, `MRI_CT_SCAN`, `TABLE`, `CHART`, `INTRAOPERATIVE_IMAGE`
|
||||||
|
- [X] T006 [P] Create `FigureStorageService` interface in `backend/src/main/java/com/aiteacher/figure/FigureStorageService.java` — declare `Path save(UUID bookId, String figureId, BufferedImage image)`, `Path resolve(UUID bookId, String filename)`, and `void delete(UUID bookId)`
|
||||||
|
- [X] T007 Create `LocalFigureStorageService` implementation in `backend/src/main/java/com/aiteacher/figure/LocalFigureStorageService.java` — writes PNG files under `${app.figure-storage.base-path}/figures/{bookId}/`; implements `FigureStorageService`; depends on T006
|
||||||
|
- [X] T008 Create `FigureStorageConfig` bean in `backend/src/main/java/com/aiteacher/config/FigureStorageConfig.java` — reads `app.figure-storage.base-path` and `app.figure-storage.min-image-size-px` as `@ConfigurationProperties`; registers `LocalFigureStorageService` as `@Bean`; adds `ResourceHandler` mapping `GET /api/v1/figures/**` to the base-path directory
|
||||||
|
- [X] T009 [P] Create `ChapterEntity` JPA entity and `ChapterRepository` in `backend/src/main/java/com/aiteacher/document/` — `@Entity(name="chapter")`, fields: `id` (String PK), `bookId` (UUID FK → book), `number` (int), `title` (String), `pageStart` (int), `createdAt` (Instant); `ChapterRepository extends JpaRepository<ChapterEntity, String>`
|
||||||
|
- [X] T010 [P] Create `SectionEntity` JPA entity and `SectionRepository` in `backend/src/main/java/com/aiteacher/document/` — `@Entity(name="section")`, fields: `id` (String PK), `chapterId` (String FK → chapter), `bookId` (UUID FK → book), `number` (String), `title` (String), `pageStart`/`pageEnd` (int), `fullText` (TEXT column), `createdAt` (Instant); `SectionRepository extends JpaRepository<SectionEntity, String>` with `findAllByBookId(UUID)`
|
||||||
|
- [X] T011 [P] Create `FigureEntity` JPA entity and `FigureRepository` in `backend/src/main/java/com/aiteacher/document/` — `@Entity(name="figure")`, fields: `id` (String PK), `bookId` (UUID), `sectionId` (String, nullable), `chapterId` (String, nullable), `label` (String), `caption` (TEXT), `figureType` (`@Enumerated` FigureType), `page` (int), `imagePath` (String), `captionEmbeddingId` (UUID, nullable), `createdAt` (Instant); `FigureRepository` with `findAllByBookId(UUID)`, `deleteAllByBookId(UUID)`
|
||||||
|
- [X] T012 Create `ChunkFigureRefEntity` JPA entity and `ChunkFigureRefRepository` in `backend/src/main/java/com/aiteacher/document/` — composite PK `(chunkId UUID, figureId String)`, `mentionPage` (int); `ChunkFigureRefRepository` with `findByChunkIdIn(List<UUID>)`, `deleteByFigureIdIn(List<String>)`
|
||||||
|
|
||||||
|
**Checkpoint**: Migrations will run on next startup; all JPA entities are wired; figure storage reads config correctly
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 3: User Story 2 — All Pages Scanned for Images During Embedding (Priority: P1)
|
||||||
|
|
||||||
|
**Goal**: When a book is uploaded, every page is inspected for images; each found image is extracted, persisted, described, and embedded as a searchable chunk alongside its metadata
|
||||||
|
|
||||||
|
**Independent Test**: Upload a PDF containing at least one page with a labelled anatomical diagram. After status shows `READY`, call `GET /api/v1/books/{id}/figures` — response must contain at least one entry with `figureType`, `caption`, `page`, and `imageUrl` populated. Verify the PNG file exists at the path in `imagePath`.
|
||||||
|
|
||||||
|
- [X] T013 [US2] Create `PdfStructureParser` service in `backend/src/main/java/com/aiteacher/document/PdfStructureParser.java` — uses Spring AI's `PagePdfDocumentReader` to extract per-page text; groups pages into `SectionEntity` records using heading-detection heuristics (lines matching `^\d+(\.\d+)*\s+[A-Z]`); groups sections into `ChapterEntity` records; persists both to Postgres via `ChapterRepository` and `SectionRepository`; returns `List<SectionEntity>` for the book
|
||||||
|
- [X] T014 [US2] Create `FigureExtractionService` in `backend/src/main/java/com/aiteacher/document/FigureExtractionService.java`: opens the PDF with PDFBox `PDDocument`; iterates pages; extracts `PDImageXObject` instances; skips images whose width or height is below `min-image-size-px`; classifies `FigureType` using the keyword-matching table from data-model.md §FigureType; parses the caption from the nearest text line matching `CAPTION_PATTERN`; saves the PNG via `FigureStorageService`; persists `FigureEntity` to `FigureRepository`; returns `List<FigureEntity>` per book
- [X] T015 [US2] Create `VisionDescriptionService` in `backend/src/main/java/com/aiteacher/document/VisionDescriptionService.java`: accepts a `Path` to a PNG and a caption `String`; calls the OpenAI vision model (via Spring AI `ChatClient` with image media type) to generate a 2–4 sentence clinical description; returns the generated description string; on API failure, returns the caption as a fallback
- [X] T016 [US2] Create `TextChunkingService` in `backend/src/main/java/com/aiteacher/document/TextChunkingService.java`: accepts a `SectionEntity`; splits `fullText` into overlapping 400–600 token windows (20-token overlap); wraps each window in a Spring AI `Document` with the flat metadata map defined in data-model.md §Text chunk document; returns `List<Document>`
- [X] T017 [US2] Create `ChunkFigureRefService` in `backend/src/main/java/com/aiteacher/document/ChunkFigureRefService.java`: accepts a Spring AI `Document` (with its `id` as `chunkId`) and a `List<FigureEntity>` for the book; scans chunk text for the patterns `Fig\.\s*\d+[\-\.]\d+` and `Figure\s+\d+[\-\.]\d+`; matches them against figure labels; persists `ChunkFigureRefEntity` rows via `ChunkFigureRefRepository`
- [X] T018 [US2] Rewrite `BookEmbeddingService.embedBook()` in `backend/src/main/java/com/aiteacher/book/BookEmbeddingService.java` to orchestrate the full pipeline: (1) `PdfStructureParser` → sections; (2) parallel: `FigureExtractionService` + `TextChunkingService` for each section; (3) `VisionDescriptionService` for each figure; (4) embed figure captions+descriptions as `Document`s (metadata per data-model.md §Figure caption document) into `vectorStore`; (5) embed text chunks into `vectorStore`; (6) `ChunkFigureRefService` for each chunk; update `captionEmbeddingId` on `FigureEntity` after embedding
- [X] T019 [US2] Extend `BookEmbeddingService.deleteBookChunks()` to also delete: all `ChunkFigureRefEntity` rows (via `findByFigureIdIn`), all `FigureEntity` rows (via `deleteAllByBookId`), all figure PNG files (via `FigureStorageService.delete(bookId)`), and all `SectionEntity` and `ChapterEntity` rows for the book
- [X] T020 [US2] Add `POST /api/v1/books/{id}/reembed` endpoint to `BookController` in `backend/src/main/java/com/aiteacher/book/BookController.java`: returns `202` with `{ bookId, status: "PROCESSING" }`; returns `404` if not found; returns `409` if already `PROCESSING`; calls `deleteBookChunks()` then `embedBook()` asynchronously
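
The overlapping-window split described in T016 can be sketched roughly as below. This is an illustration under stated assumptions, not the project's `TextChunkingService`: the class name `OverlapChunker` and its methods are hypothetical, tokens are approximated by whitespace-separated words (a real implementation would presumably count model tokens), and the window size is a plain parameter rather than the 400–600 range from the task. Calling `chunk(tokens, 500, 20)` would correspond to one plausible configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits a token list into overlapping windows. */
final class OverlapChunker {

    /** Assumption: whitespace-separated words stand in for model tokens. */
    static List<String> tokenize(String text) {
        String trimmed = text.strip();
        return trimmed.isEmpty() ? List.of() : Arrays.asList(trimmed.split("\\s+"));
    }

    /**
     * Returns windows of at most windowSize tokens; each window after the
     * first starts with the last `overlap` tokens of the previous window.
     */
    static List<List<String>> chunk(List<String> tokens, int windowSize, int overlap) {
        if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
            throw new IllegalArgumentException("need 0 <= overlap < windowSize");
        }
        List<List<String>> windows = new ArrayList<>();
        int step = windowSize - overlap; // how far the window start advances
        for (int start = 0; start < tokens.size(); start += step) {
            int end = Math.min(start + windowSize, tokens.size());
            windows.add(List.copyOf(tokens.subList(start, end)));
            if (end == tokens.size()) {
                break; // the final window already reaches the end of the text
            }
        }
        return windows;
    }
}
```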

**Checkpoint**: Upload a PDF with figures → poll `GET /api/v1/books` for `READY` → `GET /api/v1/books/{id}/figures` returns figure list → PNG accessible at `GET /api/v1/figures/{bookId}/{filename}`
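
The reference scan in T017 could look something like the following sketch. It merges the two patterns from the task into one regular expression using only `java.util.regex`. The class name `FigureRefScanner`, the method name, and the choice to normalise `12-3` to `12.3` before comparing against figure labels are all assumptions made for illustration; the actual label format is defined elsewhere in the project.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds figure references such as "Fig. 12-3" in chunk text. */
final class FigureRefScanner {

    // The two T017 patterns combined; group 1 captures the figure number.
    private static final Pattern FIGURE_REF =
            Pattern.compile("(?:Fig\\.\\s*|Figure\\s+)(\\d+[\\-.]\\d+)");

    /**
     * Returns the distinct figure numbers in order of first appearance.
     * Assumption: hyphenated numbers are normalised to dotted form.
     */
    static Set<String> findFigureNumbers(String chunkText) {
        Set<String> numbers = new LinkedHashSet<>();
        Matcher matcher = FIGURE_REF.matcher(chunkText);
        while (matcher.find()) {
            numbers.add(matcher.group(1).replace('-', '.'));
        }
        return numbers;
    }
}
```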

---

## Phase 4: User Story 1 - Image Content Surfaced in Query Results (Priority: P1)

**Goal**: A user asks a question whose answer is in a diagram; the system retrieves that diagram's content and surfaces it in the chat response with a citation

**Independent Test**: With a book embedded (Phase 3 checkpoint passed), ask a chat question whose answer is depicted only in a diagram. The response `sources` array must contain at least one entry with `type: "FIGURE"` and a non-empty `imageUrl`.

- [X] T021 [US1] Create `NeurosurgeryRetriever` service in `backend/src/main/java/com/aiteacher/retrieval/NeurosurgeryRetriever.java`: (1) text chunk search: `vectorStore.similaritySearch` with filter `type == TEXT AND book_id == bookId`, topK=5; (2) figure search: same store, filter `type == FIGURE AND book_id == bookId`, topK=3; (3) expand text chunk results to parent sections via `SectionRepository.findAllById(sectionIds)`; (4) fetch explicitly linked figures via `ChunkFigureRefRepository.findByChunkIdIn(chunkIds)` + `FigureRepository.findAllById`; (5) deduplicate figures across lists by `figureId`; return `RetrievalResult(parentSections, figureVectorHits, linkedFigures)`; add the `RetrievalResult` record in the same package
- [X] T022 [US1] Refactor `ChatService.sendMessage()` in `backend/src/main/java/com/aiteacher/chat/ChatService.java`: replace `QuestionAnswerAdvisor` with a manual call to `NeurosurgeryRetriever`; build the LLM user message from: section full texts as `[Section X.Y — Title, pp.A-B]\n{fullText}` blocks, followed by an `AVAILABLE FIGURES FOR THIS SECTION:` list with `- {label} (p.{page}): {caption} [image: {filename}]` lines per figure; append the instruction `When referencing diagrams, cite them as [Fig. X, p.N].`; send via `chatClient.prompt().system(SYSTEM_PROMPT).user(prompt).call()`
- [X] T023 [US1] Add `GET /api/v1/books/{id}/figures` endpoint to `BookController`: returns `200` with `List<FigureResponse>`; `FigureResponse` is a new record in `backend/src/main/java/com/aiteacher/book/FigureResponse.java` with fields `figureId`, `label`, `caption`, `figureType`, `page`, `imageUrl` (assembled as `/api/v1/figures/{bookId}/{filename}`), `sectionId`, `sectionTitle`; returns `404` if the book is not found
- [X] T024 [US1] Update `extractSources()` in `ChatService` to build both TEXT and FIGURE source entries: TEXT entries keep existing fields plus `"type": "TEXT"`; FIGURE entries add `"type": "FIGURE"`, `"figureId"`, `"label"`, `"caption"`, `"figureType"`, `"imageUrl"`; source data comes from `RetrievalResult` (text chunk Documents and merged FigureEntity list)
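
Step (5) of T021, deduplicating figures that arrive both as vector hits and as chunk-linked figures, might be sketched as below. The `Figure` record is a hypothetical stand-in for `FigureEntity`, and both the class name `FigureMerge` and the ordering rule (vector hits first, first occurrence wins) are assumptions, since the task only states that duplicates are removed by `figureId`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: merges two figure lists without duplicate figure IDs. */
final class FigureMerge {

    /** Stand-in for FigureEntity; only the fields needed here. */
    record Figure(String figureId, String label) {}

    /**
     * Keeps the first occurrence of each figureId. Vector hits are added
     * first, so they take precedence over linked figures with the same ID.
     */
    static List<Figure> mergeById(List<Figure> vectorHits, List<Figure> linkedFigures) {
        Map<String, Figure> byId = new LinkedHashMap<>(); // preserves insertion order
        for (Figure figure : vectorHits) {
            byId.putIfAbsent(figure.figureId(), figure);
        }
        for (Figure figure : linkedFigures) {
            byId.putIfAbsent(figure.figureId(), figure);
        }
        return new ArrayList<>(byId.values());
    }
}
```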

**Checkpoint**: Chat question answered by a diagram → response body contains `sources[n].type == "FIGURE"` with populated `imageUrl`; image loads from the returned URL
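
The prompt layout described in T022 can be illustrated with a small string builder. This is only a sketch: the `Section` and `Figure` records and the `PromptContextBuilder` class are hypothetical, the blank lines between blocks are a guess at reasonable spacing, and the task does not say where the user's question is placed, so it is left out here.

```java
import java.util.List;

/** Hypothetical helper: assembles the retrieval context in the T022 layout. */
final class PromptContextBuilder {

    /** Stand-ins for the section and figure data returned by the retriever. */
    record Section(String number, String title, int startPage, int endPage, String fullText) {}
    record Figure(String label, int page, String caption, String filename) {}

    static final String CITE_INSTRUCTION =
            "When referencing diagrams, cite them as [Fig. X, p.N].";

    static String build(List<Section> sections, List<Figure> figures) {
        StringBuilder sb = new StringBuilder();
        for (Section s : sections) {
            // \u2014 is the dash used in the "[Section X.Y ... Title, pp.A-B]" header
            sb.append("[Section ").append(s.number()).append(" \u2014 ").append(s.title())
              .append(", pp.").append(s.startPage()).append('-').append(s.endPage()).append("]\n")
              .append(s.fullText()).append("\n\n");
        }
        sb.append("AVAILABLE FIGURES FOR THIS SECTION:\n");
        for (Figure f : figures) {
            sb.append("- ").append(f.label()).append(" (p.").append(f.page()).append("): ")
              .append(f.caption()).append(" [image: ").append(f.filename()).append("]\n");
        }
        sb.append('\n').append(CITE_INSTRUCTION);
        return sb.toString();
    }
}
```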

---

## Phase 5: User Story 3 - Rich Metadata Enables Precise Source Attribution (Priority: P2)

**Goal**: Users see distinct, informative citations for text vs. image sources; image sources render inline in the chat UI

**Independent Test**: After triggering a response with figure sources, inspect the chat message in the UI: text sources and figure sources are visually distinguishable, and figure sources render the actual image inline using the `imageUrl`

- [X] T025 [P] [US3] Update API response types in `frontend/src/services/api.ts`: extend the `Source` type to include `type: 'TEXT' | 'FIGURE'`, `figureId?: string`, `label?: string`, `caption?: string`, `figureType?: string`, `imageUrl?: string`
- [X] T026 [P] [US3] Update the chat source/citation display in the frontend (wherever sources are currently rendered, e.g. `frontend/src/components/` or `frontend/src/views/`): render TEXT sources with a document icon and page number; render FIGURE sources with the image (`<img :src="source.imageUrl">`) below the label and caption text
- [X] T027 [US3] Add figure-type badge rendering in the frontend figure display: show a label derived from `figureType` (e.g. "MRI / CT", "Anatomical Diagram", "Table") alongside the figure caption so users can identify the content type without opening the image

---

## Phase 6: Polish & Cross-Cutting Concerns

- [X] T028 Update `README.md` Mermaid architecture diagram to show three storage tiers: pgvector (semantic search), Postgres (source of truth: sections, figures, refs), and file store (extracted PNGs); **required by Constitution Principle IV in the same PR as the other changes**
- [X] T029 [P] Write `FigureExtractionServiceTest` unit test in `backend/src/test/java/com/aiteacher/document/FigureExtractionServiceTest.java`, covering: images below the minimum size are skipped; `FigureType` classification matches the keyword table in data-model.md; the caption is parsed from the adjacent text line
- [X] T030 [P] Write `NeurosurgeryRetrieverTest` unit test in `backend/src/test/java/com/aiteacher/retrieval/NeurosurgeryRetrieverTest.java`, covering: figure IDs from both vector hits and chunk refs are merged without duplicates; `RetrievalResult` contains the deduplicated set
- [X] T031 Run quickstart.md validation end-to-end: upload a real PDF with a labelled diagram → wait for `READY` → call `GET /api/v1/books/{id}/figures` → send a chat message about the diagram → verify `sources` contains a `FIGURE` entry → verify `imageUrl` resolves to a PNG

---

## Dependencies & Execution Order

### Phase Dependencies

- **Phase 1 (Setup)**: No dependencies; start immediately
- **Phase 2 (Foundational)**: Requires Phase 1 complete (migrations must run before JPA entities can be wired)
- **Phase 3 (US2)**: Requires Phase 2 complete; all JPA entities + `FigureStorageService` must exist
- **Phase 4 (US1)**: Requires Phase 3 complete; figures must exist in Postgres + vector store before retrieval can surface them
- **Phase 5 (US3)**: Requires Phase 4 complete; the frontend depends on the extended `sources` format from T024
- **Phase 6 (Polish)**: Requires all story phases complete

### Within Phase 3 (Embedding Pipeline)

```
T013 (PdfStructureParser) ──────────────────────────┐
T014 (FigureExtractionService) ─────────────────────┤
T015 (VisionDescriptionService) ────────────────────┤─→ T018 (BookEmbeddingService orchestrator)
T016 (TextChunkingService) ─────────────────────────┤   └─→ T019 (cleanup)
T017 (ChunkFigureRefService) ───────────────────────┘   └─→ T020 (reembed endpoint)
```

T013–T017 can be implemented in parallel (different files, no shared dependencies). T018 depends on all of them.

### Within Phase 4 (Retrieval)

```
T021 (NeurosurgeryRetriever) ──────────────────────┐
                                                   └─→ T022 (ChatService update)
                                                   └─→ T024 (extractSources update)
T023 (figures endpoint) ── independent [P]
```

### Parallel Opportunities per Phase

**Phase 2**: T005, T006, T009, T010, T011 can all run in parallel. T007 depends on T006. T012 can follow T010/T011.

**Phase 3**: T013, T014, T015, T016, T017 all in parallel. T018 depends on all.

**Phase 5**: T025 and T026 in parallel; T027 can follow T026.

**Phase 6**: T029 and T030 in parallel.

---

## Implementation Strategy

### MVP: User Story 2 Only (Embedding Pipeline)

1. Phase 1 (Setup) → Phase 2 (Foundational) → Phase 3 (US2, T013–T020)
2. **Validate**: `GET /api/v1/books/{id}/figures` returns figures for a test book
3. **Stop and demo**: the pipeline produces image chunks without any retrieval changes

### Full Feature Delivery

1. Phase 1 + 2 → Foundation ready
2. Phase 3 (US2) → Embedding pipeline produces image chunks ← **demo point**
3. Phase 4 (US1) → Chat surfaces image content in responses ← **core payoff**
4. Phase 5 (US3) → Frontend renders inline figures with type badges
5. Phase 6 (Polish) → README, tests, end-to-end validation

---

## Notes

- [P] tasks = different files, no dependencies on each other within the same phase
- [US1/US2/US3] label maps each task to a user story for traceability
- Phase 3 (US2) must be fully complete before beginning Phase 4 (US1), because retrieval cannot surface figures that do not yet exist
- The `uploads/figures/` directory must exist and be writable at runtime; `FigureStorageService` creates subdirectories automatically
- Re-embedding (T020) deletes all existing chunks and figures for the book before re-running, so it is safe to call on books processed by feature 001
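
The status-code rules for the re-embed endpoint in T020 reduce to a small decision function, sketched here without any Spring dependencies. The class `ReembedStatus`, the method name, and the use of `Optional` to model "book not found" are assumptions for illustration; only the `PROCESSING` and `READY` states are named in this plan, and the real entity may define more.

```java
import java.util.Optional;

/** Hypothetical helper: picks the HTTP status for POST /api/v1/books/{id}/reembed. */
final class ReembedStatus {

    /** Only the two states mentioned in the plan; others may exist. */
    enum BookStatus { PROCESSING, READY }

    /**
     * @param currentStatus empty when no book with the given ID exists
     * @return 404 if missing, 409 if already processing, otherwise 202
     */
    static int httpStatusFor(Optional<BookStatus> currentStatus) {
        if (currentStatus.isEmpty()) {
            return 404; // book not found
        }
        if (currentStatus.get() == BookStatus.PROCESSING) {
            return 409; // a run is already in progress
        }
        return 202; // accepted; cleanup and re-embedding continue asynchronously
    }
}
```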