Internal Contract: DocumentAiPageParser → FigureExtractionService
Branch: 002-image-aware-embedding | Date: 2026-04-04
Type: Internal Java DTO (not an HTTP contract)
Purpose
PageResult is the internal data transfer object produced by DocumentAiPageParser for each
PDF page. It decouples the Google Document AI SDK types from the rest of the pipeline so that
PdfStructureParser can be replaced without cascading changes.
Java Record
```java
package com.aiteacher.document;

import java.util.List;

/**
 * Internal DTO produced by DocumentAiPageParser for one PDF page.
 * Decouples the Document AI SDK types from downstream services.
 */
public record PageResult(
    int pageNumber,           // 1-based, matches Document.Page.getPageNumber()
    String orderedText,       // full page text in correct reading order (blocks joined by \n\n)
    String headingTitle,      // first HEADING block on page, or null
    List<FigureBbox> figures  // detected figure regions (may be empty)
) {
    /**
     * Normalized bounding box for a detected figure region.
     * Coordinates are in the [0.0, 1.0] range relative to page dimensions.
     */
    public record FigureBbox(
        float x,               // left edge (normalized)
        float y,               // top edge (normalized)
        float width,           // width (normalized)
        float height,          // height (normalized)
        String nearestCaption  // text of adjacent paragraph block, or null
    ) {}
}
```
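As a rough illustration of how `orderedText` and `headingTitle` could be assembled, the sketch below joins paragraph and heading blocks with `\n\n` and picks the first heading. The `Block` record and the `PageTextRules` class are hypothetical stand-ins, not types from the Document AI SDK or from this codebase, and table handling (tab-separated text) is left out.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PageTextRules {
    // Hypothetical stand-in for a parsed layout block; not an SDK type.
    record Block(String blockType, String text) {}

    // True for HEADING_1 through HEADING_6.
    static boolean isHeading(Block block) {
        return block.blockType().matches("HEADING_[1-6]");
    }

    // Joins PARAGRAPH and HEADING_* blocks in their given order with a blank line.
    static String orderedText(List<Block> blocks) {
        return blocks.stream()
            .filter(b -> b.blockType().equals("PARAGRAPH") || isHeading(b))
            .map(Block::text)
            .collect(Collectors.joining("\n\n"));
    }

    // Returns the text of the first heading block, or null if the page has none.
    static String headingTitle(List<Block> blocks) {
        return blocks.stream()
            .filter(PageTextRules::isHeading)
            .map(Block::text)
            .findFirst()
            .orElse(null);
    }
}
```

The sketch assumes the blocks already arrive in reading order; any reordering would have to happen before these helpers are called.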
Production Rules
| Field | Rule |
|---|---|
| `orderedText` | Concatenation of all `PARAGRAPH` and `HEADING_*` blocks, joined with `\n\n`. Tables are represented as tab-separated text. |
| `headingTitle` | First block whose `blockType` is `HEADING_1` through `HEADING_6`. `null` if no heading detected. |
| `figures` | One entry per `VisualElement` with `type == "figure"` and confidence ≥ 0.5. Sorted top-to-bottom by `y`. |
| `nearestCaption` | The `PARAGRAPH` block immediately following the figure bbox (by Y coordinate). May be `null` if no paragraph follows within 10% of page height. |
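The figure rules above can be sketched as a filter, a sort, and a caption lookup. This is a minimal sketch under stated assumptions: `Candidate` and `Paragraph` are hypothetical stand-ins for parser output, and the 10% caption window is assumed to be measured from the bottom edge of the figure to the top edge of the paragraph, which the rules do not spell out.

```java
import java.util.Comparator;
import java.util.List;

final class FigureRules {
    // Hypothetical stand-ins for parser output; not part of the contract.
    record Candidate(String type, float confidence, float x, float y, float width, float height) {}
    record Paragraph(float top, String text) {}
    // Local copy of the contract's FigureBbox so the sketch compiles on its own.
    record FigureBbox(float x, float y, float width, float height, String nearestCaption) {}

    static final float MIN_CONFIDENCE = 0.5f;
    static final float MAX_CAPTION_GAP = 0.10f; // 10% of page height (normalized)

    // First paragraph that starts at or below the figure's bottom edge, within the gap.
    static String nearestCaption(Candidate figure, List<Paragraph> paragraphs) {
        float bottom = figure.y() + figure.height();
        return paragraphs.stream()
            .filter(p -> p.top() >= bottom && p.top() - bottom <= MAX_CAPTION_GAP)
            .min(Comparator.comparingDouble(Paragraph::top))
            .map(Paragraph::text)
            .orElse(null);
    }

    // Keeps confident figures only and orders them top-to-bottom by y.
    static List<FigureBbox> selectFigures(List<Candidate> candidates, List<Paragraph> paragraphs) {
        return candidates.stream()
            .filter(c -> "figure".equals(c.type()) && c.confidence() >= MIN_CONFIDENCE)
            .sorted(Comparator.comparingDouble(Candidate::y))
            .map(c -> new FigureBbox(c.x(), c.y(), c.width(), c.height(),
                                     nearestCaption(c, paragraphs)))
            .toList();
    }
}
```

If the intended measurement is from the figure's top edge or centre instead, only the `bottom` computation in `nearestCaption` would need to change.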
Mapping from Document AI Proto
```
Document.Page.Block                 → orderedText (concatenated)
Document.Page.Block (HEADING_*)     → headingTitle (first match)
Document.Page.VisualElement         → FigureBbox
  └─ layout.bounding_poly.normalized_vertices[0] → (x, y) top-left
  └─ normalized_vertices[2]                      → (x+w, y+h) bottom-right
```
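The vertex mapping above amounts to taking the top-left corner as `(x, y)` and subtracting it from the bottom-right corner to get width and height. The sketch below shows that arithmetic; `Vertex` and `BboxMapper` are hypothetical names standing in for the SDK's normalized vertex type and for whatever class performs the conversion.

```java
import java.util.List;

final class BboxMapper {
    // Hypothetical stand-in for a normalized vertex; not the SDK type.
    record Vertex(float x, float y) {}
    // Local copy of the contract's FigureBbox so the sketch compiles on its own.
    record FigureBbox(float x, float y, float width, float height, String nearestCaption) {}

    // Index 0 is treated as top-left and index 2 as bottom-right, as in the mapping.
    static FigureBbox fromVertices(List<Vertex> vertices, String nearestCaption) {
        if (vertices.size() < 3) {
            throw new IllegalArgumentException("expected at least 3 vertices, got " + vertices.size());
        }
        Vertex topLeft = vertices.get(0);
        Vertex bottomRight = vertices.get(2);
        return new FigureBbox(
            topLeft.x(),
            topLeft.y(),
            bottomRight.x() - topLeft.x(),   // width  = (x + w) - x
            bottomRight.y() - topLeft.y(),   // height = (y + h) - y
            nearestCaption);
    }
}
```

The sketch trusts the vertex order stated in the mapping; a rotated or skewed polygon would need a min/max over all vertices instead.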
Consumers
| Consumer | What It Uses |
|---|---|
| `BookEmbeddingService` | `orderedText` → `SectionEntity.fullText`; `headingTitle` → `SectionEntity.title` |
| `FigureExtractionService` | `figures` list → renders page via PDFBox, crops each bbox to `BufferedImage` |
| `TextChunkingService` | Receives `SectionEntity` (indirectly uses `orderedText`); unchanged |
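The cropping step in `FigureExtractionService` could look roughly like the sketch below, which scales a normalized bbox to pixel coordinates and cuts the region out of an already rendered page image. The PDFBox rendering itself is omitted, and `FigureCropper` plus the clamping behaviour are assumptions for illustration, not the service's actual code.

```java
import java.awt.image.BufferedImage;

final class FigureCropper {
    // Local copy of the contract's FigureBbox so the sketch compiles on its own.
    record FigureBbox(float x, float y, float width, float height, String nearestCaption) {}

    static int clamp(int value, int lo, int hi) {
        return Math.max(lo, Math.min(hi, value));
    }

    // Scales normalized coordinates to pixels and crops; edges are clamped to the
    // image so a slightly out-of-range bbox does not raise RasterFormatException.
    static BufferedImage crop(BufferedImage page, FigureBbox bbox) {
        int pageWidth = page.getWidth();
        int pageHeight = page.getHeight();
        int left = clamp(Math.round(bbox.x() * pageWidth), 0, pageWidth - 1);
        int top = clamp(Math.round(bbox.y() * pageHeight), 0, pageHeight - 1);
        int right = clamp(Math.round((bbox.x() + bbox.width()) * pageWidth), left + 1, pageWidth);
        int bottom = clamp(Math.round((bbox.y() + bbox.height()) * pageHeight), top + 1, pageHeight);
        // getSubimage shares pixel data with the source image rather than copying it.
        return page.getSubimage(left, top, right - left, bottom - top);
    }
}
```

Because `getSubimage` returns a view onto the same raster, a caller that keeps crops after discarding the page image may prefer to copy the result into a fresh `BufferedImage`.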