Compare commits
15 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 820734c251 | | |
| | 0711e40c66 | | |
| | 0db31e91ab | | |
| | d480d04145 | | |
| | c2d034d1fe | | |
| | 0908355704 | | |
| | 8e227a9429 | | |
| | d8bcdce879 | | |
| | aee6a9dfba | | |
| | 0cf318f0a7 | | |
| | e5d53b4e80 | | |
| | 5c641f4bcc | | |
| | ea1276dc2e | | |
| | b154e29f2d | | |
| | 5acfdd33c1 | | |
@@ -0,0 +1,27 @@
.git/
.gitignore
*.md
.DS_Store
Thumbs.db

# Java build artifacts
target/
*.class
*.jar

# Node
node_modules/
dist/
*.log

# Env files (never bake secrets into images)
.env
.env.*
!.env.example

# Spec / docs
specs/

# Editor
.vscode/
.idea/
@@ -0,0 +1,13 @@
|
||||
# Copy this file to .env and fill in your values before running docker-compose.native.yml
|
||||
# .env is gitignored — never commit real credentials
|
||||
|
||||
# OpenAI
|
||||
OPENAI_API_KEY=sk-...
|
||||
|
||||
# AWS S3 (figure storage — leave blank if using local filesystem)
|
||||
AWS_ACCESS_KEY_ID=
|
||||
AWS_SECRET_ACCESS_KEY=
|
||||
AWS_REGION=eu-west-1
|
||||
|
||||
# S3 bucket name (if S3 storage enabled)
|
||||
APP_STORAGE_S3_BUCKET=ai-teacher-figures
|
||||
+12
@@ -1,3 +1,15 @@
# Runtime uploads (extracted figures)
uploads/

# Java build
target/
*.class
*.jar

# Node
node_modules/
dist/

# OS
.DS_Store
Thumbs.db
@@ -1,8 +1,23 @@
|
||||
# ai-teacher Development Guidelines
|
||||
|
||||
Auto-generated from all feature plans. Last updated: 2026-03-31
|
||||
Auto-generated from all feature plans. Last updated: 2026-04-10
|
||||
|
||||
## Active Technologies
|
||||
- Java 25 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (embeddings + chat), PDFBox (via Spring AI PDF reader dependency) (002-image-aware-embedding)
|
||||
- PostgreSQL (JPA + Flyway), pgvector (Spring AI `VectorStore`), local file system (extracted images — `/uploads/figures/`) (002-image-aware-embedding)
|
||||
- Java 25 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API, PDFBox (rendering only), `com.google.cloud:google-cloud-documentai` (~2.40.x) (002-image-aware-embedding)
|
||||
- PostgreSQL (JPA + Flyway), pgvector (Spring AI VectorStore), S3 / local filesystem (figure images) (002-image-aware-embedding)
|
||||
- PostgreSQL (JPA + Flyway), pgvector (Spring AI `VectorStore`), S3-compatible (002-image-aware-embedding)
|
||||
- Java 21 (backend) / TypeScript + Node 20 (frontend) + Spring Boot 4.0.5, Spring Security (already included), Vue 3.4, Vue Router 4.3, Pinia 2.1, Axios 1.7 (003-basic-login)
|
||||
- No new storage — credentials held in browser `sessionStorage` (frontend only) (003-basic-login)
|
||||
- Java 21 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (chat + embeddings), pgvector, Vue 3.4, Pinia 2.1 (004-rag-retrieval-quality)
|
||||
- PostgreSQL (sections, figures, messages — unchanged). No new tables needed. (004-rag-retrieval-quality)
|
||||
- Java 21 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (chat + embeddings), Vue 3.4, Pinia 2.1, Axios 1.7 (004-rag-retrieval-quality)
|
||||
- PostgreSQL (JPA + Flyway), pgvector (`VectorStore`) (004-rag-retrieval-quality)
|
||||
- Java 25 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, `native-maven-plugin` 0.10.6, (005-native-image-deployment)
|
||||
- PostgreSQL 16 + pgvector (unchanged) (005-native-image-deployment)
|
||||
- TypeScript / Node 20 (frontend only) + Vue 3.4, Vue Router 4.3, Pinia 2.1 — no changes (006-mobile-responsive-ui)
|
||||
- N/A (frontend-only change) (006-mobile-responsive-ui)
|
||||
|
||||
- Java 21 (backend), TypeScript / Node 20 (frontend) (001-neuro-rag-learning)
|
||||
|
||||
@@ -22,8 +37,10 @@ npm test && npm run lint
|
||||
Java 21 (backend), TypeScript / Node 20 (frontend): Follow standard conventions
|
||||
|
||||
## Recent Changes
|
||||
- 006-mobile-responsive-ui: Added TypeScript / Node 20 (frontend only) + Vue 3.4, Vue Router 4.3, Pinia 2.1 — no changes
|
||||
- 005-native-image-deployment: Added Java 25 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, `native-maven-plugin` 0.10.6,
|
||||
- 004-rag-retrieval-quality: Added Java 21 (backend), TypeScript / Node 20 (frontend) + Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (chat + embeddings), Vue 3.4, Pinia 2.1, Axios 1.7
|
||||
|
||||
- 001-neuro-rag-learning: Added Java 21 (backend), TypeScript / Node 20 (frontend)
|
||||
|
||||
<!-- MANUAL ADDITIONS START -->
|
||||
<!-- MANUAL ADDITIONS END -->
|
||||
|
||||
@@ -0,0 +1,566 @@
|
||||
# Marker
|
||||
|
||||
Marker converts documents to markdown, JSON, chunks, and HTML quickly and accurately.
|
||||
|
||||
- Converts PDF, image, PPTX, DOCX, XLSX, HTML, EPUB files in all languages
|
||||
- Formats tables, forms, equations, inline math, links, references, and code blocks
|
||||
- Extracts and saves images
|
||||
- Removes headers/footers/other artifacts
|
||||
- Extensible with your own formatting and logic
|
||||
- Does structured extraction, given a JSON schema (beta)
|
||||
- Optionally boost accuracy with LLMs (and your own prompt)
|
||||
- Works on GPU, CPU, or MPS
|
||||
|
||||
For our managed API or on-prem document intelligence solution, check out [our platform here](https://datalab.to?utm_source=gh-marker).
|
||||
|
||||
## Performance
|
||||
|
||||
<img src="data/images/overall.png" width="800px"/>
|
||||
|
||||
Marker benchmarks favorably compared to cloud services like Llamaparse and Mathpix, as well as other open source tools.
|
||||
|
||||
The above results are running single PDF pages serially. Marker is significantly faster when running in batch mode, with a projected throughput of 25 pages/second on an H100.
|
||||
|
||||
See [below](#benchmarks) for detailed speed and accuracy benchmarks, and instructions on how to run your own benchmarks.
|
||||
|
||||
## Hybrid Mode
|
||||
|
||||
For the highest accuracy, pass the `--use_llm` flag to use an LLM alongside marker. This will do things like merge tables across pages, handle inline math, format tables properly, and extract values from forms. It can use any gemini or ollama model. By default, it uses `gemini-2.0-flash`. See [below](#llm-services) for details.
|
||||
|
||||
Here is a table benchmark comparing marker, gemini flash alone, and marker with use_llm:
|
||||
|
||||
<img src="data/images/table.png" width="400px"/>
|
||||
|
||||
As you can see, the use_llm mode offers higher accuracy than marker or gemini alone.
|
||||
|
||||
## Examples
|
||||
|
||||
| PDF | File type | Markdown | JSON |
|-----|-----------|----------|------|
| [Think Python](https://greenteapress.com/thinkpython/thinkpython.pdf) | Textbook | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/markdown/thinkpython/thinkpython.md) | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/json/thinkpython.json) |
| [Switch Transformers](https://arxiv.org/pdf/2101.03961.pdf) | arXiv paper | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/markdown/switch_transformers/switch_trans.md) | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/json/switch_trans.json) |
| [Multi-column CNN](https://arxiv.org/pdf/1804.07821.pdf) | arXiv paper | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/markdown/multicolcnn/multicolcnn.md) | [View](https://github.com/VikParuchuri/marker/blob/master/data/examples/json/multicolcnn.json) |
|
||||
|
||||
# Commercial usage
|
||||
|
||||
Our model weights use a modified AI Pubs Open Rail-M license (free for research, personal use, and startups under $2M funding/revenue) and our code is GPL. For broader commercial licensing or to remove GPL requirements, visit our pricing page [here](https://www.datalab.to/pricing?utm_source=gh-marker).
|
||||
|
||||
# Hosted API & On-prem
|
||||
|
||||
There's a [hosted API](https://www.datalab.to?utm_source=gh-marker) and [painless on-prem solution](https://www.datalab.to/blog/self-serve-on-prem-licensing) for marker - it's free to sign up, and we'll throw in credits for you to test it out.
|
||||
|
||||
The API:
|
||||
- Supports PDF, image, PPT, PPTX, DOC, DOCX, XLS, XLSX, HTML, EPUB files
|
||||
- Is 1/4th the price of leading cloud-based competitors
|
||||
- Fast - ~15s for a 250 page PDF
|
||||
- Supports LLM mode
|
||||
- High uptime (99.99%)
|
||||
|
||||
# Community
|
||||
|
||||
[Discord](https://discord.gg//KuZwXNGnfH) is where we discuss future development.
|
||||
|
||||
# Installation
|
||||
|
||||
You'll need python 3.10+ and [PyTorch](https://pytorch.org/get-started/locally/).
|
||||
|
||||
Install with:
|
||||
|
||||
```shell
pip install marker-pdf
```
|
||||
|
||||
If you want to use marker on documents other than PDFs, you will need to install additional dependencies with:
|
||||
|
||||
```shell
pip install marker-pdf[full]
```
|
||||
|
||||
# Usage
|
||||
|
||||
First, some configuration:
|
||||
|
||||
- Your torch device will be automatically detected, but you can override this. For example, `TORCH_DEVICE=cuda`.
|
||||
- Some PDFs, even digital ones, have bad text in them. Set `--force_ocr` to force OCR on all lines, or set `--strip_existing_ocr` to keep all digital text and strip out any existing OCR text.
- If you care about inline math, set `--force_ocr` to convert inline math to LaTeX.
|
||||
|
||||
## Interactive App
|
||||
|
||||
I've included a streamlit app that lets you interactively try marker with some basic options. Run it with:
|
||||
|
||||
```shell
pip install streamlit streamlit-ace
marker_gui
```
|
||||
|
||||
## Convert a single file
|
||||
|
||||
```shell
marker_single /path/to/file.pdf
```
|
||||
|
||||
You can pass in PDFs or images.
|
||||
|
||||
Options:
|
||||
- `--page_range TEXT`: Specify which pages to process. Accepts comma-separated page numbers and ranges. Example: `--page_range "0,5-10,20"` will process pages 0, 5 through 10, and page 20.
|
||||
- `--output_format [markdown|json|html|chunks]`: Specify the format for the output results.
|
||||
- `--output_dir PATH`: Directory where output files will be saved. Defaults to the value specified in settings.OUTPUT_DIR.
|
||||
- `--paginate_output`: Paginates the output, using `\n\n{PAGE_NUMBER}` followed by `-` * 48, then `\n\n`
|
||||
- `--use_llm`: Uses an LLM to improve accuracy. You will need to configure the LLM backend - see [below](#llm-services).
|
||||
- `--force_ocr`: Force OCR processing on the entire document, even for pages that might contain extractable text. This will also format inline math properly.
|
||||
- `--block_correction_prompt`: if LLM mode is active, an optional prompt that will be used to correct the output of marker. This is useful for custom formatting or logic that you want to apply to the output.
|
||||
- `--strip_existing_ocr`: Remove all existing OCR text in the document and re-OCR with surya.
|
||||
- `--redo_inline_math`: If you want the absolute highest quality inline math conversion, use this along with `--use_llm`.
|
||||
- `--disable_image_extraction`: Don't extract images from the PDF. If you also specify `--use_llm`, then images will be replaced with a description.
|
||||
- `--debug`: Enable debug mode for additional logging and diagnostic information.
|
||||
- `--processors TEXT`: Override the default processors by providing their full module paths, separated by commas. Example: `--processors "module1.processor1,module2.processor2"`
|
||||
- `--config_json PATH`: Path to a JSON configuration file containing additional settings.
|
||||
- `config --help`: List all available builders, processors, and converters, and their associated configuration. These values can be used to build a JSON configuration file for additional tweaking of marker defaults.
|
||||
- `--converter_cls`: One of `marker.converters.pdf.PdfConverter` (default) or `marker.converters.table.TableConverter`. The `PdfConverter` will convert the whole PDF, the `TableConverter` will only extract and convert tables.
|
||||
- `--llm_service`: Which llm service to use if `--use_llm` is passed. This defaults to `marker.services.gemini.GoogleGeminiService`.
|
||||
- `--help`: See all of the flags that can be passed into marker. (It supports many more options than are listed above.)
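
To make the `--page_range` syntax concrete, here is a minimal sketch of how a spec such as `"0,5-10,20"` expands into page numbers. `parse_page_range` is a hypothetical helper written for illustration; it is not marker's own parser, and it assumes ranges are inclusive, as the example in the option description suggests.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a spec such as "0,5-10,20" into a sorted list of page numbers.

    Hypothetical helper for illustration only; ranges are treated as inclusive.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if end < start:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(parse_page_range("0,5-10,20"))
```

The result lists page 0, pages 5 through 10, and page 20, matching the description of the option.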
|
||||
|
||||
The list of supported languages for surya OCR is [here](https://github.com/VikParuchuri/surya/blob/master/surya/recognition/languages.py). If you don't need OCR, marker can work with any language.
|
||||
|
||||
## Convert multiple files
|
||||
|
||||
```shell
marker /path/to/input/folder
```
|
||||
|
||||
- `marker` supports all the same options from `marker_single` above.
|
||||
- `--workers` is the number of conversion workers to run simultaneously. This is automatically set by default, but you can increase it to increase throughput, at the cost of more CPU/GPU usage. Marker will use 5GB of VRAM per worker at the peak, and 3.5GB average.
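
As a rough sizing aid, the peak figure above can be turned into an upper bound on `--workers` for a given GPU. This is a back-of-the-envelope sketch, not something marker computes this way: `max_workers` and the `reserve_gb` parameter are hypothetical, and the only number taken from the text is the 5GB peak per worker.

```python
import math

# Peak VRAM per worker, as quoted above.
PEAK_VRAM_PER_WORKER_GB = 5.0


def max_workers(total_vram_gb: float, reserve_gb: float = 0.0) -> int:
    """Estimate how many workers fit in VRAM if all of them peak at once.

    reserve_gb is a hypothetical safety margin for other processes on the GPU.
    """
    usable = total_vram_gb - reserve_gb
    if usable <= 0:
        return 0
    return math.floor(usable / PEAK_VRAM_PER_WORKER_GB)


print(max_workers(24))                 # 24GB card, no reserve
print(max_workers(80, reserve_gb=4))   # 80GB card, 4GB held back
```

Sizing against the peak rather than the 3.5GB average is the conservative choice; it trades some throughput for a lower risk of out-of-memory errors.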
|
||||
|
||||
## Convert multiple files on multiple GPUs
|
||||
|
||||
```shell
NUM_DEVICES=4 NUM_WORKERS=15 marker_chunk_convert ../pdf_in ../md_out
```
|
||||
|
||||
- `NUM_DEVICES` is the number of GPUs to use. Should be `2` or greater.
|
||||
- `NUM_WORKERS` is the number of parallel processes to run on each GPU.
|
||||
|
||||
## Use from python
|
||||
|
||||
See the `PdfConverter` class at `marker/converters/pdf.py` for additional arguments that can be passed.
|
||||
|
||||
```python
from marker.converters.pdf import PdfConverter
from marker.models import create_model_dict
from marker.output import text_from_rendered

converter = PdfConverter(
    artifact_dict=create_model_dict(),
)
rendered = converter("FILEPATH")
text, _, images = text_from_rendered(rendered)
```
|
||||
|
||||
`rendered` will be a pydantic basemodel with different properties depending on the output type requested. With markdown output (default), you'll have the properties `markdown`, `metadata`, and `images`. For json output, you'll have `children`, `block_type`, and `metadata`.
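
With markdown output, the `markdown` and `images` properties are usually written to disk together. The sketch below shows one way to do that. It assumes `images` maps file names to image objects exposing a PIL-style `save(path)` method; `save_markdown_output` and the `output.md` file name are hypothetical and are not part of marker.

```python
from pathlib import Path


def save_markdown_output(markdown: str, images: dict, out_dir: str) -> Path:
    """Write the markdown text and its images into one folder.

    Assumption: each value in `images` has a PIL-style save(path) method.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    md_path = out / "output.md"  # hypothetical file name
    md_path.write_text(markdown, encoding="utf-8")

    # Images go next to the markdown file so relative image links resolve.
    for name, image in images.items():
        image.save(out / name)
    return md_path
```

With the converter example above, this could be called as `save_markdown_output(rendered.markdown, rendered.images, "out")`.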
|
||||
|
||||
### Custom configuration
|
||||
|
||||
You can pass configuration using the `ConfigParser`. To see all available options, do `marker_single --help`.
|
||||
|
||||
```python
from marker.converters.pdf import PdfConverter
from marker.models import create_model_dict
from marker.config.parser import ConfigParser

config = {
    "output_format": "json",
    "ADDITIONAL_KEY": "VALUE"
}
config_parser = ConfigParser(config)

converter = PdfConverter(
    config=config_parser.generate_config_dict(),
    artifact_dict=create_model_dict(),
    processor_list=config_parser.get_processors(),
    renderer=config_parser.get_renderer(),
    llm_service=config_parser.get_llm_service()
)
rendered = converter("FILEPATH")
```
|
||||
|
||||
### Extract blocks
|
||||
|
||||
Each document consists of one or more pages. Pages contain blocks, which can themselves contain other blocks. It's possible to programmatically manipulate these blocks.
|
||||
|
||||
Here's an example of extracting all forms from a document:
|
||||
|
||||
```python
from marker.converters.pdf import PdfConverter
from marker.models import create_model_dict
from marker.schema import BlockTypes

converter = PdfConverter(
    artifact_dict=create_model_dict(),
)
document = converter.build_document("FILEPATH")
forms = document.contained_blocks((BlockTypes.Form,))
```
|
||||
|
||||
Look at the processors for more examples of extracting and manipulating blocks.
|
||||
|
||||
## Other converters
|
||||
|
||||
You can also use other converters that define different conversion pipelines:
|
||||
|
||||
### Extract tables
|
||||
|
||||
The `TableConverter` will only convert and extract tables:
|
||||
|
||||
```python
from marker.converters.table import TableConverter
from marker.models import create_model_dict
from marker.output import text_from_rendered

converter = TableConverter(
    artifact_dict=create_model_dict(),
)
rendered = converter("FILEPATH")
text, _, images = text_from_rendered(rendered)
```
|
||||
|
||||
This takes all the same configuration as the PdfConverter. You can specify the configuration `force_layout_block=Table` to avoid layout detection and instead assume every page is a table. Set `output_format=json` to also get cell bounding boxes.
|
||||
|
||||
You can also run this via the CLI with
|
||||
```shell
marker_single FILENAME --use_llm --force_layout_block Table --converter_cls marker.converters.table.TableConverter --output_format json
```
|
||||
|
||||
### OCR Only
|
||||
|
||||
If you only want to run OCR, you can also do that through the `OCRConverter`. Set `--keep_chars` to keep individual characters and bounding boxes.
|
||||
|
||||
```python
from marker.converters.ocr import OCRConverter
from marker.models import create_model_dict

converter = OCRConverter(
    artifact_dict=create_model_dict(),
)
rendered = converter("FILEPATH")
```
|
||||
|
||||
This takes all the same configuration as the PdfConverter.
|
||||
|
||||
You can also run this via the CLI with
|
||||
```shell
marker_single FILENAME --converter_cls marker.converters.ocr.OCRConverter
```
|
||||
|
||||
### Structured Extraction (beta)
|
||||
|
||||
You can run structured extraction via the `ExtractionConverter`. This requires an LLM service to be set up first (see [here](#llm-services) for details). You'll get a JSON output with the extracted values.
|
||||
|
||||
```python
from marker.converters.extraction import ExtractionConverter
from marker.models import create_model_dict
from marker.config.parser import ConfigParser
from pydantic import BaseModel

class Links(BaseModel):
    links: list[str]

schema = Links.model_json_schema()
config_parser = ConfigParser({
    "page_schema": schema
})

converter = ExtractionConverter(
    artifact_dict=create_model_dict(),
    config=config_parser.generate_config_dict(),
    llm_service=config_parser.get_llm_service(),
)
rendered = converter("FILEPATH")
```
|
||||
|
||||
`rendered` will have an `original_markdown` field. If you pass this back in as the `existing_markdown` config key the next time you run the converter, you can skip re-parsing the document.
|
||||
|
||||
# Output Formats
|
||||
|
||||
## Markdown
|
||||
|
||||
Markdown output will include:
|
||||
|
||||
- image links (images will be saved in the same folder)
|
||||
- formatted tables
|
||||
- embedded LaTeX equations (fenced with `$$`)
|
||||
- Code is fenced with triple backticks
|
||||
- Superscripts for footnotes
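
When `--paginate_output` is used, the markdown can be split back into pages on the separator described in the options list (`\n\n{PAGE_NUMBER}`, then 48 dashes, then `\n\n`). The sketch below assumes the page number is written inside literal braces, for example `{3}`; that reading of the option text is an assumption, and `split_pages` is a hypothetical helper rather than part of marker.

```python
import re

# Assumption: a separator looks like "{3}" followed by exactly 48 dashes.
# The leading blank line is optional so a separator at the very start of
# the text is also recognized.
PAGE_SEPARATOR = re.compile(r"(?:^|\n\n)\{(\d+)\}-{48}\n\n")


def split_pages(markdown: str) -> dict[int, str]:
    """Map page number -> page text for paginated markdown output."""
    # re.split with one capture group returns
    # [text_before_first_separator, page_no, text, page_no, text, ...]
    parts = PAGE_SEPARATOR.split(markdown)
    pages: dict[int, str] = {}
    for i in range(1, len(parts) - 1, 2):
        pages[int(parts[i])] = parts[i + 1].strip()
    return pages


dashes = "-" * 48
sample = "{0}" + dashes + "\n\nFirst page\n\n{1}" + dashes + "\n\nSecond page"
print(split_pages(sample))
```

Text that appears before the first separator is ignored, and input without any separators yields an empty dictionary.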
|
||||
|
||||
## HTML
|
||||
|
||||
HTML output is similar to markdown output:
|
||||
|
||||
- Images are included via `img` tags
|
||||
- equations are fenced with `<math>` tags
|
||||
- code is in `pre` tags
|
||||
|
||||
## JSON
|
||||
|
||||
JSON output will be organized in a tree-like structure, with the leaf nodes being blocks. Examples of leaf nodes are a single list item, a paragraph of text, or an image.
|
||||
|
||||
The output will be a list, with each list item representing a page. Each page is considered a block in the internal marker schema. There are different types of blocks to represent different elements.
|
||||
|
||||
Pages have the keys:
|
||||
|
||||
- `id` - unique id for the block.
|
||||
- `block_type` - the type of block. The possible block types can be seen in `marker/schema/__init__.py`. As of this writing, they are ["Line", "Span", "FigureGroup", "TableGroup", "ListGroup", "PictureGroup", "Page", "Caption", "Code", "Figure", "Footnote", "Form", "Equation", "Handwriting", "TextInlineMath", "ListItem", "PageFooter", "PageHeader", "Picture", "SectionHeader", "Table", "Text", "TableOfContents", "Document"]
|
||||
- `html` - the HTML for the page. Note that this will have recursive references to children. The `content-ref` tags must be replaced with the child content if you want the full html. You can see an example of this at `marker/output.py:json_to_html`. That function will take in a single block from the json output, and turn it into HTML.
|
||||
- `polygon` - the 4-corner polygon of the page, in (x1,y1), (x2,y2), (x3, y3), (x4, y4) format. (x1,y1) is the top left, and coordinates go clockwise.
|
||||
- `children` - the child blocks.
|
||||
|
||||
The child blocks have two additional keys:
|
||||
|
||||
- `section_hierarchy` - indicates the sections that the block is part of. `1` indicates an h1 tag, `2` an h2, and so on.
|
||||
- `images` - base64 encoded images. The key will be the block id, and the data will be the encoded image.
|
||||
|
||||
Note that child blocks of pages can have their own children as well (a tree structure).
|
||||
|
||||
```json
{
    "id": "/page/10/Page/366",
    "block_type": "Page",
    "html": "<content-ref src='/page/10/SectionHeader/0'></content-ref><content-ref src='/page/10/SectionHeader/1'></content-ref><content-ref src='/page/10/Text/2'></content-ref><content-ref src='/page/10/Text/3'></content-ref><content-ref src='/page/10/Figure/4'></content-ref><content-ref src='/page/10/SectionHeader/5'></content-ref><content-ref src='/page/10/SectionHeader/6'></content-ref><content-ref src='/page/10/TextInlineMath/7'></content-ref><content-ref src='/page/10/TextInlineMath/8'></content-ref><content-ref src='/page/10/Table/9'></content-ref><content-ref src='/page/10/SectionHeader/10'></content-ref><content-ref src='/page/10/Text/11'></content-ref>",
    "polygon": [[0.0, 0.0], [612.0, 0.0], [612.0, 792.0], [0.0, 792.0]],
    "children": [
        {
            "id": "/page/10/SectionHeader/0",
            "block_type": "SectionHeader",
            "html": "<h1>Supplementary Material for <i>Subspace Adversarial Training</i> </h1>",
            "polygon": [
                [217.845703125, 80.630859375], [374.73046875, 80.630859375],
                [374.73046875, 107.0],
                [217.845703125, 107.0]
            ],
            "children": null,
            "section_hierarchy": {
                "1": "/page/10/SectionHeader/1"
            },
            "images": {}
        },
        ...
    ]
}
```
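
The tree structure and the `content-ref` substitution described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation in `marker/output.py`: `iter_blocks` and `resolve_html` are hypothetical helpers, the regular expression assumes `content-ref` tags are written exactly as in the example, and the sample page is made up.

```python
import re
from typing import Iterator

# Assumption: refs look like <content-ref src='ID'></content-ref>
# (single or double quotes), as in the example output.
CONTENT_REF = re.compile(r"<content-ref src=['\"]([^'\"]+)['\"]></content-ref>")


def iter_blocks(block: dict) -> Iterator[dict]:
    """Yield a block and all of its descendants, depth first."""
    yield block
    for child in block.get("children") or []:  # "children" may be null
        yield from iter_blocks(child)


def resolve_html(block: dict) -> str:
    """Replace each content-ref tag with the HTML of the matching child."""
    children = {c["id"]: c for c in block.get("children") or []}

    def substitute(match: re.Match) -> str:
        child = children.get(match.group(1))
        # Unknown references are left untouched.
        return resolve_html(child) if child else match.group(0)

    return CONTENT_REF.sub(substitute, block["html"])


# Made-up page with two leaf blocks.
page = {
    "id": "/page/0/Page/2",
    "block_type": "Page",
    "html": "<content-ref src='/page/0/SectionHeader/0'></content-ref>"
            "<content-ref src='/page/0/Text/1'></content-ref>",
    "children": [
        {"id": "/page/0/SectionHeader/0", "block_type": "SectionHeader",
         "html": "<h1>Title</h1>", "children": None},
        {"id": "/page/0/Text/1", "block_type": "Text",
         "html": "<p>Body</p>", "children": None},
    ],
}

headers = [b["id"] for b in iter_blocks(page) if b["block_type"] == "SectionHeader"]
print(headers)
print(resolve_html(page))
```

Filtering on `block_type` while walking the tree is usually enough to pull out all tables, figures, or headers from a page.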
|
||||
|
||||
## Chunks
|
||||
|
||||
Chunks format is similar to JSON, but flattens everything into a single list instead of a tree. Only the top-level blocks from each page show up. It also has the full HTML of each block inside, so you don't need to crawl the tree to reconstruct it. This enables flexible and easy chunking for RAG.
|
||||
|
||||
## Metadata
|
||||
|
||||
All output formats will return a metadata dictionary, with the following fields:
|
||||
|
||||
```json
{
    "table_of_contents": [
      {
        "title": "Introduction",
        "heading_level": 1,
        "page_id": 0,
        "polygon": [...]
      }
    ], // computed PDF table of contents
    "page_stats": [
      {
        "page_id": 0,
        "text_extraction_method": "pdftext",
        "block_counts": [("Span", 200), ...]
      },
      ...
    ]
}
```
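
As an illustration of how the metadata might be consumed, the sketch below renders `table_of_contents` as an indented outline. It relies only on the `title`, `heading_level`, and `page_id` fields shown above; `format_toc` is a hypothetical helper and the sample entries are made up.

```python
def format_toc(metadata: dict) -> list[str]:
    """Render table_of_contents entries as an indented outline."""
    lines = []
    for entry in metadata.get("table_of_contents", []):
        # Treat a missing heading level as a top-level heading.
        level = entry.get("heading_level") or 1
        indent = "  " * max(level - 1, 0)
        lines.append(f"{indent}- {entry['title']} (page {entry['page_id']})")
    return lines


# Made-up metadata in the shape documented above.
metadata = {
    "table_of_contents": [
        {"title": "Introduction", "heading_level": 1, "page_id": 0},
        {"title": "Background", "heading_level": 2, "page_id": 1},
    ]
}
print("\n".join(format_toc(metadata)))
```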
|
||||
|
||||
# LLM Services
|
||||
|
||||
When running with the `--use_llm` flag, you have a choice of services you can use:
|
||||
|
||||
- `Gemini` - this will use the Gemini developer API by default. You'll need to pass `--gemini_api_key` to configuration.
|
||||
- `Google Vertex` - this will use vertex, which can be more reliable. You'll need to pass `--vertex_project_id`. To use it, set `--llm_service=marker.services.vertex.GoogleVertexService`.
|
||||
- `Ollama` - this will use local models. You can configure `--ollama_base_url` and `--ollama_model`. To use it, set `--llm_service=marker.services.ollama.OllamaService`.
|
||||
- `Claude` - this will use the anthropic API. You can configure `--claude_api_key`, and `--claude_model_name`. To use it, set `--llm_service=marker.services.claude.ClaudeService`.
|
||||
- `OpenAI` - this supports any openai-like endpoint. You can configure `--openai_api_key`, `--openai_model`, and `--openai_base_url`. To use it, set `--llm_service=marker.services.openai.OpenAIService`.
|
||||
- `Azure OpenAI` - this uses the Azure OpenAI service. You can configure `--azure_endpoint`, `--azure_api_key`, and `--deployment_name`. To use it, set `--llm_service=marker.services.azure_openai.AzureOpenAIService`.
|
||||
|
||||
These services may have additional optional configuration as well - you can see it by viewing the classes.
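
When scripting conversions, it can help to assemble the `marker_single` command from a dictionary of options. The sketch below only builds the argument list and uses flags that appear in this document; `build_marker_command` is a hypothetical helper, and the file name, base URL, and model name are placeholders.

```python
import shlex


def build_marker_command(path: str, **options) -> list[str]:
    """Build a marker_single argument list from keyword options.

    True becomes a bare flag, False/None is skipped, and anything else
    becomes "--name value".
    """
    cmd = ["marker_single", path]
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            cmd.append(flag)
        elif value is False or value is None:
            continue
        else:
            cmd.extend([flag, str(value)])
    return cmd


cmd = build_marker_command(
    "paper.pdf",                                   # placeholder input file
    use_llm=True,
    llm_service="marker.services.ollama.OllamaService",
    ollama_base_url="http://localhost:11434",      # placeholder URL
    ollama_model="MODEL_NAME",                     # placeholder model name
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.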
|
||||
|
||||
# Internals
|
||||
|
||||
Marker is easy to extend. The core units of marker are:
|
||||
|
||||
- `Providers`, at `marker/providers`. These provide information from a source file, like a PDF.
|
||||
- `Builders`, at `marker/builders`. These generate the initial document blocks and fill in text, using info from the providers.
|
||||
- `Processors`, at `marker/processors`. These process specific blocks, for example the table formatter is a processor.
|
||||
- `Renderers`, at `marker/renderers`. These use the blocks to render output.
|
||||
- `Schema`, at `marker/schema`. The classes for all the block types.
|
||||
- `Converters`, at `marker/converters`. They run the whole end to end pipeline.
|
||||
|
||||
To customize processing behavior, override the `processors`. To add new output formats, write a new `renderer`. For additional input formats, write a new `provider`.
|
||||
|
||||
Processors and renderers can be directly passed into the base `PdfConverter`, so you can specify your own custom processing easily.
|
||||
|
||||
## API server
|
||||
|
||||
There is a very simple API server you can run like this:
|
||||
|
||||
```shell
pip install -U uvicorn fastapi python-multipart
marker_server --port 8001
```
|
||||
|
||||
This will start a fastapi server that you can access at `localhost:8001`. You can go to `localhost:8001/docs` to see the endpoint options.
|
||||
|
||||
You can send requests like this:
|
||||
|
||||
```python
import requests
import json

post_data = {
    'filepath': 'FILEPATH',
    # Add other params here
}

requests.post("http://localhost:8001/marker", data=json.dumps(post_data)).json()
```
|
||||
|
||||
Note that this is not a very robust API, and is only intended for small-scale use. If you want to use this server, but want a more robust conversion option, you can use the hosted [Datalab API](https://www.datalab.to/plans).
|
||||
|
||||
# Troubleshooting
|
||||
|
||||
There are some settings that you may find useful if things aren't working the way you expect:
|
||||
|
||||
- If you have issues with accuracy, try setting `--use_llm` to use an LLM to improve quality. You must set `GOOGLE_API_KEY` to a Gemini API key for this to work.
|
||||
- Make sure to set `force_ocr` if you see garbled text - this will re-OCR the document.
|
||||
- `TORCH_DEVICE` - set this to force marker to use a given torch device for inference.
|
||||
- If you're getting out of memory errors, decrease worker count. You can also try splitting up long PDFs into multiple files.
|
||||
|
||||
## Debugging
|
||||
|
||||
Pass the `debug` option to activate debug mode. This will save images of each page with detected layout and text, as well as output a json file with additional bounding box information.
|
||||
|
||||
# Benchmarks
|
||||
|
||||
## Overall PDF Conversion
|
||||
|
||||
We created a [benchmark set](https://huggingface.co/datasets/datalab-to/marker_benchmark) by extracting single PDF pages from common crawl. We scored based on a heuristic that aligns text with ground truth text segments, and an LLM as a judge scoring method.
|
||||
|
||||
| Method     | Avg Time | Heuristic Score | LLM Score |
|------------|----------|-----------------|-----------|
| marker     | 2.83837  | 95.6709         | 4.23916   |
| llamaparse | 23.348   | 84.2442         | 3.97619   |
| mathpix    | 6.36223  | 86.4281         | 4.15626   |
| docling    | 3.69949  | 86.7073         | 3.70429   |
|
||||
|
||||
Benchmarks were run on an H100 for marker and docling - llamaparse and mathpix used their cloud services. We can also look at it by document type:
|
||||
|
||||
<img src="data/images/per_doc.png" width="1000px"/>
|
||||
|
||||
| Document Type        | Marker heuristic | Marker LLM | Llamaparse Heuristic | Llamaparse LLM | Mathpix Heuristic | Mathpix LLM | Docling Heuristic | Docling LLM |
|----------------------|------------------|------------|----------------------|----------------|-------------------|-------------|-------------------|-------------|
| Scientific paper     | 96.6737          | 4.34899    | 87.1651              | 3.96421        | 91.2267           | 4.46861     | 92.135            | 3.72422     |
| Book page            | 97.1846          | 4.16168    | 90.9532              | 4.07186        | 93.8886           | 4.35329     | 90.0556           | 3.64671     |
| Other                | 95.1632          | 4.25076    | 81.1385              | 4.01835        | 79.6231           | 4.00306     | 83.8223           | 3.76147     |
| Form                 | 88.0147          | 3.84663    | 66.3081              | 3.68712        | 64.7512           | 3.33129     | 68.3857           | 3.40491     |
| Presentation         | 95.1562          | 4.13669    | 81.2261              | 4              | 83.6737           | 3.95683     | 84.8405           | 3.86331     |
| Financial document   | 95.3697          | 4.39106    | 82.5812              | 4.16111        | 81.3115           | 4.05556     | 86.3882           | 3.8         |
| Letter               | 98.4021          | 4.5        | 93.4477              | 4.28125        | 96.0383           | 4.45312     | 92.0952           | 4.09375     |
| Engineering document | 93.9244          | 4.04412    | 77.4854              | 3.72059        | 80.3319           | 3.88235     | 79.6807           | 3.42647     |
| Legal document       | 96.689           | 4.27759    | 86.9769              | 3.87584        | 91.601            | 4.20805     | 87.8383           | 3.65552     |
| Newspaper page       | 98.8733          | 4.25806    | 84.7492              | 3.90323        | 96.9963           | 4.45161     | 92.6496           | 3.51613     |
| Magazine page        | 98.2145          | 4.38776    | 87.2902              | 3.97959        | 93.5934           | 4.16327     | 93.0892           | 4.02041     |
|
||||
|
||||
## Throughput
|
||||
|
||||
We benchmarked throughput using a [single long PDF](https://www.greenteapress.com/thinkpython/thinkpython.pdf).
|
||||
|
||||
| Method  | Time per page (s) | Time per document (s) | VRAM used |
|---------|-------------------|-----------------------|-----------|
| marker  | 0.18              | 43.42                 | 3.17GB    |
|
||||
|
||||
The projected throughput is 122 pages per second on an H100 - we can run 22 individual processes given the VRAM used.
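
The projection follows directly from the table: each process handles one page every 0.18 seconds, and 22 processes run side by side. A small sketch of that arithmetic, assuming throughput scales linearly with the number of processes (`projected_throughput` is a hypothetical helper):

```python
TIME_PER_PAGE_S = 0.18    # from the throughput table
PARALLEL_PROCESSES = 22   # processes that fit in VRAM, per the text


def projected_throughput(time_per_page_s: float, processes: int) -> float:
    """Pages per second, assuming perfectly linear scaling across processes."""
    return processes / time_per_page_s


print(round(projected_throughput(TIME_PER_PAGE_S, PARALLEL_PROCESSES)))  # 122
```

Linear scaling is an idealization; contention for GPU compute and memory bandwidth would lower the real figure.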
|
||||
|
||||
## Table Conversion
|
||||
|
||||
Marker can extract tables from PDFs using `marker.converters.table.TableConverter`. The table extraction performance is measured by comparing the extracted HTML representation of tables against the original HTML representations using the test split of [FinTabNet](https://developer.ibm.com/exchanges/data/all/fintabnet/). The HTML representations are compared using a tree edit distance based metric to judge both structure and content. Marker detects and identifies the structure of all tables in a PDF page and achieves these scores:
|
||||
|
||||
| Method           | Avg score | Total tables |
|------------------|-----------|--------------|
| marker           | 0.816     | 99           |
| marker w/use_llm | 0.907     | 99           |
| gemini           | 0.829     | 99           |
|
||||
|
||||
The `--use_llm` flag can significantly improve table recognition performance, as you can see.
|
||||
|
||||
We filter out tables that we cannot align with the ground truth, since FinTabNet and our layout model have slightly different detection methods (this results in some tables being split or merged).
|
||||
|
||||
## Running your own benchmarks
|
||||
|
||||
You can benchmark the performance of marker on your machine. Install marker manually with:
|
||||
|
||||
```shell
git clone https://github.com/VikParuchuri/marker.git
# Enter the cloned repository before installing its dependencies
cd marker
poetry install
```
|
||||
|
||||
### Overall PDF Conversion
|
||||
|
||||
Download the benchmark data [here](https://drive.google.com/file/d/1ZSeWDo2g1y0BRLT7KnbmytV2bjWARWba/view?usp=sharing) and unzip. Then run the overall benchmark like this:
|
||||
|
||||
```shell
|
||||
python benchmarks/overall.py --methods marker --scores heuristic,llm
|
||||
```
|
||||
|
||||
Options:
|
||||
|
||||
- `--use_llm`: use an LLM to improve the marker results.
- `--max_rows`: how many rows to process for the benchmark.
- `--methods`: which methods to benchmark; can be `llamaparse`, `mathpix`, `docling`, `marker`. Comma-separated.
- `--scores`: which scoring functions to use; can be `llm`, `heuristic`. Comma-separated.
|
||||
|
||||
### Table Conversion
|
||||
The processed FinTabNet dataset is hosted [here](https://huggingface.co/datasets/datalab-to/fintabnet-test) and is automatically downloaded. Run the benchmark with:
|
||||
|
||||
```shell
|
||||
python benchmarks/table/table.py --max_rows 100
|
||||
```
|
||||
|
||||
Options:
|
||||
|
||||
- `--use_llm`: use an LLM with marker to improve accuracy.
- `--use_gemini`: also benchmark Gemini 2.0 Flash.
|
||||
|
||||
# How it works
|
||||
|
||||
Marker is a pipeline of deep learning models:
|
||||
|
||||
- Extract text, OCR if necessary (heuristics, [surya](https://github.com/VikParuchuri/surya))
|
||||
- Detect page layout and find reading order ([surya](https://github.com/VikParuchuri/surya))
|
||||
- Clean and format each block (heuristics, [texify](https://github.com/VikParuchuri/texify), [surya](https://github.com/VikParuchuri/surya))
|
||||
- Optionally use an LLM to improve quality
|
||||
- Combine blocks and postprocess complete text
|
||||
|
||||
It only uses models where necessary, which improves speed and accuracy.
|
||||
|
||||
# Limitations
|
||||
|
||||
PDF is a tricky format, so marker will not always work perfectly. Here are some known limitations that are on the roadmap to address:
|
||||
|
||||
- Very complex layouts, with nested tables and forms, may not work
|
||||
- Forms may not be rendered well
|
||||
|
||||
Note: Passing the `--use_llm` and `--force_ocr` flags will mostly solve these issues.
|
||||
|
||||
# Usage and Deployment Examples
|
||||
|
||||
You can always run `marker` locally, but if you want to expose it as an API, there are a few options:
|
||||
- Our platform API which is powered by `marker` and `surya` and is easy to test out - it's free to sign up, and we'll include credits, [try it out here](https://datalab.to)
|
||||
- Our painless on-prem solution for commercial use, which you can [read about here](https://www.datalab.to/blog/self-serve-on-prem-licensing) and gives you privacy guarantees with high throughput inference optimizations.
|
||||
- [Deployment example with Modal](./examples/README_MODAL.md) that shows you how to deploy and access `marker` through a web endpoint using [`Modal`](https://modal.com). Modal is an AI compute platform that enables developers to deploy and scale models on GPUs in minutes.
|
||||
@@ -9,20 +9,135 @@ AI-generated cross-book summaries, and engage in grounded RAG chat.
|
||||
```mermaid
|
||||
graph TD
|
||||
User["Neurosurgeon (Browser)"]
|
||||
Login["Login Page\n(username + password form)"]
|
||||
FE["Frontend\nVue.js 3 / Vite\n:5173"]
|
||||
BE["Backend\nSpring Boot 4 / Spring AI\n:8080"]
|
||||
DB["PostgreSQL + pgvector\n(provided)"]
|
||||
LLM["LLM Provider\n(OpenAI / configurable)"]
|
||||
Auth["Spring Security\nHTTP Basic Auth"]
|
||||
DB["PostgreSQL + pgvector\n(source of truth)"]
|
||||
FS["File Store\nuploads/ (local disk)\nExtracted figure PNGs"]
|
||||
LLM["LLM Provider\n(OpenAI)\nEmbeddings + Chat + Vision"]
|
||||
|
||||
User -->|HTTP| FE
|
||||
FE -->|REST /api/v1/...| BE
|
||||
BE -->|JDBC / pgvector| DB
|
||||
BE -->|Embedding + Chat API| LLM
|
||||
User -->|"First visit / unauthenticated"| Login
|
||||
Login -->|"Submit credentials\n(GET /api/v1/auth/check)"| Auth
|
||||
Auth -->|"401 → back to login\n200 → app access"| Login
|
||||
Login -->|"Authenticated"| FE
|
||||
FE -->|"REST /api/v1/...\n(HTTP Basic on every request)"| Auth
|
||||
Auth --> BE
|
||||
BE -->|"JDBC — books, chapters,\nsections, figures, refs"| DB
|
||||
BE -->|"pgvector — text chunks\n+ figure caption vectors"| DB
|
||||
BE -->|"PNG read/write\n(figure extraction)"| FS
|
||||
FE -->|"GET /api/v1/figures/**\n(static file serving)"| BE
|
||||
BE -->|"Embedding + Chat\n+ Vision (image description)"| LLM
|
||||
|
||||
subgraph "Embedding Pipeline (per PDF upload)"
|
||||
EP1["Parse pages → SectionEntity"]
|
||||
EP2["Extract images → FigureEntity"]
|
||||
EP3["Vision describe → embed caption"]
|
||||
EP4["Chunk text → embed chunks"]
|
||||
EP5["Link chunks ↔ figures"]
|
||||
EP1 --> EP2
|
||||
EP1 --> EP4
|
||||
EP2 --> EP3
|
||||
EP4 --> EP5
|
||||
EP3 --> EP5
|
||||
end
|
||||
|
||||
subgraph "Retrieval Pipeline (per chat query)"
|
||||
RP0["Query expansion\n(QueryExpansionService)\nlay → clinical terms"]
|
||||
RP1["Text chunk search (topK=5)"]
|
||||
RP2["Figure caption search (topK=3)"]
|
||||
RP3["Expand chunks → ±1-page section text"]
|
||||
RP4["Fetch linked figures (chunk_figure_ref)"]
|
||||
RP5["Merge + deduplicate figures"]
|
||||
RP6["Build labelled prompt\n[S1],[F1]… tags"]
|
||||
RP7["LLM chat call"]
|
||||
RP8["Citation validation\n(CitationValidatorService)\nstrip hallucinated refs"]
|
||||
RP0 --> RP1
|
||||
RP0 --> RP2
|
||||
RP1 --> RP3
|
||||
RP1 --> RP4
|
||||
RP2 --> RP5
|
||||
RP4 --> RP5
|
||||
RP3 --> RP6
|
||||
RP5 --> RP6
|
||||
RP6 --> RP7
|
||||
RP7 --> RP8
|
||||
end
|
||||
```
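
Step RP5 of the retrieval pipeline merges the figures found by caption search (RP2) with the figures linked from retrieved chunks (RP4) and removes duplicates. A minimal sketch of one way to do this, assuming figures are identified by a string ID and that caption-search hits should come first in the merged order; `FigureMerge` and `mergeFigureIds` are hypothetical names, not the project's actual classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public final class FigureMerge {

    /**
     * Merges two lists of figure IDs, dropping duplicates.
     * Insertion order is preserved: caption-search hits first, then chunk-linked figures.
     */
    public static List<String> mergeFigureIds(List<String> fromCaptionSearch,
                                              List<String> fromChunkLinks) {
        Set<String> merged = new LinkedHashSet<>(fromCaptionSearch);
        merged.addAll(fromChunkLinks);
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // "f2" is found by both routes and should appear only once
        System.out.println(mergeFigureIds(List.of("f1", "f2"), List.of("f2", "f3")));
    }
}
```

A `LinkedHashSet` keeps the result deterministic, which matters if the `[F1]`, `[F2]` labels in the prompt are assigned by position.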
|
||||
|
||||
## Marker API Response Structure
|
||||
|
||||
The PDF parsing pipeline calls a local [Marker](https://github.com/VikParuchuri/marker) server (`POST /marker/upload`).
|
||||
|
||||
### Top-level envelope
|
||||
|
||||
```json
|
||||
{
|
||||
"format": "json",
|
||||
"output": "<JSON-encoded string>"
|
||||
}
|
||||
```
|
||||
|
||||
`output` is a **JSON-encoded string** (not a nested object) and must be parsed a second time to get the document tree.
|
||||
|
||||
### Parsed `output` shape
|
||||
|
||||
```
|
||||
{
|
||||
"children": [ <Page block>, ... ]
|
||||
}
|
||||
```
|
||||
|
||||
### Block types
|
||||
|
||||
Every block shares these fields:
|
||||
|
||||
| Field               | Type            | Notes                                      |
|---------------------|-----------------|--------------------------------------------|
| `id`                | string          | e.g. `/page/0/Picture/2`                   |
| `block_type`        | string          | see table below                            |
| `html`              | string          | rendered HTML; may contain `<content-ref>` |
| `bbox`              | `[x0,y0,x1,y1]` | bounding box in page coordinates           |
| `children`          | array or null   | nested blocks                              |
| `images`            | object or null  | base64 PNG map (leaf image blocks only)    |
| `section_hierarchy` | object          | heading ancestry                           |
|
||||
|
||||
#### Known `block_type` values
|
||||
|
||||
| block_type       | Category  | Notes                                                 |
|------------------|-----------|-------------------------------------------------------|
| `Page`           | structure | Top-level; direct children are the page content       |
| `SectionHeader`  | text      | Section / chapter heading                             |
| `Text`           | text      |                                                       |
| `TextInlineMath` | text      |                                                       |
| `ListItem`       | text      |                                                       |
| `Table`          | text      |                                                       |
| `Code`           | text      |                                                       |
| `Equation`       | text      |                                                       |
| `Footnote`       | text      |                                                       |
| `Caption`        | text      | Usually a child of a `*Group` block                   |
| `PageHeader`     | text      |                                                       |
| `PageFooter`     | text      |                                                       |
| `Handwriting`    | text      |                                                       |
| `Picture`        | image     | Leaf block; `images` map holds base64 PNG keyed by ID |
| `Figure`         | image     | Leaf block; same as `Picture`                         |
| `PictureGroup`   | container | Wraps one `Picture` + one `Caption` child             |
| `FigureGroup`    | container | Wraps one `Figure` + one `Caption` child              |
|
||||
|
||||
### Image extraction
|
||||
|
||||
Images are only present on **leaf** image blocks (`Picture`, `Figure`).
|
||||
Group blocks (`PictureGroup`, `FigureGroup`) have `images: null` — the base64 PNG lives on the child leaf block.
|
||||
|
||||
```
|
||||
PictureGroup
|
||||
├── Picture ← images: { "/page/0/Picture/2": "<base64 PNG>" }
|
||||
└── Caption ← html: "<p>Figure 1 — ...</p>"
|
||||
```
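
The rule above (images live only on `Picture` and `Figure` leaves, never on the group) suggests a depth-first walk over the block tree. The following is a minimal sketch under stated assumptions: `Block` is a hypothetical, already-deserialized view holding only the fields used here, and `MarkerImageWalk` is not a class from this repository.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class MarkerImageWalk {

    /** Hypothetical minimal view of a Marker block; children and images may be null. */
    public record Block(String id, String blockType,
                        List<Block> children, Map<String, String> images) {}

    /** Collects decoded PNG bytes from every leaf image block, keyed by block ID. */
    public static Map<String, byte[]> collectImages(Block root) {
        Map<String, byte[]> found = new LinkedHashMap<>();
        walk(root, found);
        return found;
    }

    private static void walk(Block block, Map<String, byte[]> found) {
        boolean leafImage = "Picture".equals(block.blockType())
                || "Figure".equals(block.blockType());
        if (leafImage && block.images() != null) {
            // The images map holds base64-encoded PNG data keyed by block ID
            block.images().forEach((id, b64) ->
                    found.put(id, Base64.getDecoder().decode(b64)));
        }
        if (block.children() != null) {
            // Group blocks carry no images themselves; descend to reach the leaves
            for (Block child : block.children()) {
                walk(child, found);
            }
        }
    }
}
```

Because the walk descends through every block with children, it reaches the `Picture` inside a `PictureGroup` without needing a special case for group types.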
|
||||
|
||||
## Stack
|
||||
|
||||
- **Backend**: Spring Boot 4.0.5 + Spring AI 2.0.0-M4, Java 21, Maven
|
||||
- **Backend**: Spring Boot 4.0.5 + Spring AI 2.0.0-M4, Java 25, Maven
|
||||
- **Frontend**: Vue.js 3 + Vite + TypeScript + Pinia + Axios
|
||||
- **Database**: PostgreSQL 16 + pgvector extension
|
||||
- **Auth**: HTTP Basic (single shared in-memory user)
|
||||
@@ -31,7 +146,7 @@ graph TD
|
||||
|
||||
See [specs/001-neuro-rag-learning/quickstart.md](specs/001-neuro-rag-learning/quickstart.md) for full instructions.
|
||||
|
||||
### Local Dev
|
||||
### Local Dev (JVM)
|
||||
|
||||
```bash
|
||||
# Start the database
|
||||
@@ -47,8 +162,73 @@ npm install
|
||||
npm run dev
|
||||
```
|
||||
|
||||
### Native Image Build
|
||||
|
||||
Produces a GraalVM native binary packaged into a minimal Docker image via Jib.
|
||||
|
||||
**Prerequisite**: GraalVM 25 must be installed and set as `JAVA_HOME`.
|
||||
|
||||
```bash
# Install GraalVM 25 CE via sdkman (one-time)
sdk install java 25-graalce
sdk use java 25-graalce

# Build native executable + Docker image (requires Docker daemon)
cd backend
mvn -Pnative package jib:build -DskipTests
mvn -Pnative jib:build -Djib.to.auth.username=admin -Djib.to.auth.password=""
```
|
||||
|
||||
### Frontend build
|
||||
```bash
buildah build \
  --platform linux/arm64 \
  --tag zot.immich-ad.ovh/ai-teacher-frontend:latest \
  frontend/
buildah login zot.immich-ad.ovh
```
|
||||
Push to the private repository:
|
||||
|
||||
```bash
buildah push --tls-verify=false zot.immich-ad.ovh/ai-teacher-frontend:latest
```
|
||||
|
||||
### Run Native Stack (Docker Compose)
|
||||
|
||||
```bash
# Copy and fill in secrets
cp .env.example .env
# edit .env — add OPENAI_API_KEY at minimum

# Start PostgreSQL + native backend
docker compose -f docker-compose.native.yml up
```
|
||||
|
||||
App available at `http://localhost:8080`.
|
||||
|
||||
### Build Pipeline (Native)
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
SRC["Source Code\n(Java 25)"]
|
||||
AOT["Spring Boot AOT\n(process-aot)"]
|
||||
NI["GraalVM native-image\n(native-maven-plugin)"]
|
||||
EXE["Native Executable\ntarget/ai-teacher-backend"]
|
||||
JIB["Jib\n(jib-native-image-extension)"]
|
||||
IMG["Docker Image\nai-teacher-backend:latest\n(distroless base)"]
|
||||
|
||||
SRC --> AOT
|
||||
AOT --> NI
|
||||
NI --> EXE
|
||||
EXE --> JIB
|
||||
JIB --> IMG
|
||||
```
|
||||
|
||||
### Environment Variables
|
||||
|
||||
#### Backend
|
||||
|
||||
| Variable | Required | Description |
|
||||
|----------|----------|-------------|
|
||||
| `OPENAI_API_KEY` | Yes | OpenAI API key for embeddings and chat |
|
||||
@@ -56,3 +236,15 @@ npm run dev
|
||||
| `DB_URL` | Yes | JDBC URL, e.g. `jdbc:postgresql://localhost:5432/aiteacher` |
|
||||
| `DB_USERNAME` | Yes | Database username |
|
||||
| `DB_PASSWORD` | Yes | Database password |
|
||||
| `FIGURE_STORAGE_PATH` | No | Base path for uploaded PDFs and extracted figures (default: `./uploads`) |
|
||||
| `UPLOAD_ENABLED` | No | Set to `false` to disable the book upload endpoint (default: `true`) |
|
||||
| `DELETE_ENABLED` | No | Set to `false` to disable the book delete endpoint (default: `true`) |
|
||||
|
||||
#### Frontend
|
||||
|
||||
| Variable | Required | Description |
|----------|----------|-------------|
| `VITE_API_URL` | No | Backend API base URL (default: `/api/v1`) |
| `VITE_APP_PASSWORD` | Yes | Shared password for HTTP Basic auth (must match `APP_PASSWORD`) |
| `VITE_UPLOAD_ENABLED` | No | Set to `false` to hide the upload UI (default: `true`) |
| `VITE_DELETE_ENABLED` | No | Set to `false` to hide the delete button (default: `true`) |
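
Since every API request carries HTTP Basic credentials, a client has to send an `Authorization` header built from the username and the shared password. A minimal sketch of how that header value is formed, using only the JDK; the username `user` and password `secret` are placeholders, and `BasicAuthHeader` is a hypothetical helper rather than part of this codebase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public final class BasicAuthHeader {

    /** Builds the value of the Authorization header for HTTP Basic auth. */
    public static String value(String username, String password) {
        // Basic auth is base64("username:password") prefixed with the scheme name
        String pair = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials; a real client would read them from configuration
        System.out.println("Authorization: " + value("user", "secret"));
    }
}
```

Base64 is an encoding, not encryption, so the header is only as private as the transport; this is one reason to serve the app over HTTPS.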
|
||||
|
||||
@@ -0,0 +1,24 @@
|
||||
# Java build artifacts
|
||||
target/
|
||||
*.class
|
||||
*.jar
|
||||
|
||||
# Git
|
||||
.git/
|
||||
.gitignore
|
||||
|
||||
# Editor
|
||||
.vscode/
|
||||
.idea/
|
||||
*.iml
|
||||
|
||||
# OS
|
||||
.DS_Store
|
||||
Thumbs.db
|
||||
|
||||
# Logs
|
||||
*.log
|
||||
|
||||
# Environment
|
||||
.env
|
||||
.env.*
|
||||
@@ -0,0 +1,25 @@
|
||||
# ---- Pull Maven from its official image (avoids microdnf under QEMU) ----
|
||||
FROM docker.io/library/maven:3.9.9-eclipse-temurin-21 AS maven-dist
|
||||
|
||||
# ---- Build stage: GraalVM 25 + Maven ----
|
||||
FROM ghcr.io/graalvm/native-image-community:25 AS build
|
||||
|
||||
# Copy Maven from the official Maven image — no package installation needed
|
||||
COPY --from=maven-dist /usr/share/maven /opt/maven
|
||||
ENV PATH="/opt/maven/bin:$PATH"
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Cache dependency resolution separately from source compilation
|
||||
COPY pom.xml .
|
||||
RUN mvn -Pnative dependency:resolve dependency:resolve-plugins -q
|
||||
|
||||
# Build native executable
|
||||
COPY src ./src
|
||||
RUN mvn -Pnative package -DskipTests
|
||||
|
||||
# ---- Runtime stage: slim Debian with glibc + libz (required by GraalVM native binary) ----
|
||||
FROM docker.io/library/debian:12-slim
|
||||
COPY --from=build /app/target/ai-teacher-backend /app/ai-teacher-backend
|
||||
EXPOSE 8080
|
||||
ENTRYPOINT ["/app/ai-teacher-backend"]
|
||||
+126
-2
@@ -32,6 +32,13 @@
|
||||
<type>pom</type>
|
||||
<scope>import</scope>
|
||||
</dependency>
|
||||
<dependency>
|
||||
<groupId>software.amazon.awssdk</groupId>
|
||||
<artifactId>bom</artifactId>
|
||||
<version>2.30.14</version>
|
||||
<type>pom</type>
|
||||
<scope>import</scope>
|
||||
</dependency>
|
||||
</dependencies>
|
||||
</dependencyManagement>
|
||||
|
||||
@@ -95,12 +102,25 @@
|
||||
<artifactId>spring-ai-advisors-vector-store</artifactId>
|
||||
</dependency>
|
||||
|
||||
<!-- Spring AI — PDF document reader -->
|
||||
<!-- Spring AI — PDF document reader (includes PDFBox transitively) -->
|
||||
<dependency>
|
||||
<groupId>org.springframework.ai</groupId>
|
||||
<artifactId>spring-ai-pdf-document-reader</artifactId>
|
||||
</dependency>
|
||||
|
||||
<!-- PDFBox — page rendering and cropping for figure extraction -->
|
||||
<dependency>
|
||||
<groupId>org.apache.pdfbox</groupId>
|
||||
<artifactId>pdfbox</artifactId>
|
||||
<version>3.0.3</version>
|
||||
</dependency>
|
||||
|
||||
<!-- AWS SDK v2 — S3 figure storage -->
|
||||
<dependency>
|
||||
<groupId>software.amazon.awssdk</groupId>
|
||||
<artifactId>s3</artifactId>
|
||||
</dependency>
|
||||
|
||||
<!-- Jackson (JSON) -->
|
||||
<dependency>
|
||||
<groupId>com.fasterxml.jackson.core</groupId>
|
||||
@@ -120,15 +140,119 @@
|
||||
</dependency>
|
||||
</dependencies>
|
||||
|
||||
|
||||
<build>
|
||||
<plugins>
|
||||
<plugin>
|
||||
<groupId>org.graalvm.buildtools</groupId>
|
||||
<artifactId>native-maven-plugin</artifactId>
|
||||
</plugin>
|
||||
|
||||
<plugin>
|
||||
<groupId>org.springframework.boot</groupId>
|
||||
<artifactId>spring-boot-maven-plugin</artifactId>
|
||||
</plugin>
|
||||
|
||||
<!-- Jib — package native executable (or fat-jar) into Docker image -->
|
||||
<plugin>
|
||||
<groupId>com.google.cloud.tools</groupId>
|
||||
<artifactId>jib-maven-plugin</artifactId>
|
||||
<version>3.5.1</version>
|
||||
<configuration>
|
||||
<from>
|
||||
<!-- distroless glibc base — includes libz + libssl needed by GraalVM native binary -->
|
||||
<image>gcr.io/distroless/base-debian12</image>
|
||||
</from>
|
||||
<to>
|
||||
<image>zot.immich-ad.ovh/ai-teacher-backend</image>
|
||||
<tags>
|
||||
<tag>latest</tag>
|
||||
</tags>
|
||||
</to>
|
||||
<container>
|
||||
<format>OCI</format>
|
||||
<ports>
|
||||
<port>8080</port>
|
||||
</ports>
|
||||
<!-- invoke the native binary directly — no JVM -->
|
||||
<entrypoint>
|
||||
<arg>/app/ai-teacher-backend</arg>
|
||||
</entrypoint>
|
||||
</container>
|
||||
<!-- copy the GraalVM-compiled binary from target/ into /app/ -->
|
||||
<extraDirectories>
|
||||
<paths>
|
||||
<path>
|
||||
<from>${project.build.directory}</from>
|
||||
<into>/app</into>
|
||||
<includes>ai-teacher-backend</includes>
|
||||
</path>
|
||||
</paths>
|
||||
<permissions>
|
||||
<permission>
|
||||
<file>/app/ai-teacher-backend</file>
|
||||
<mode>755</mode>
|
||||
</permission>
|
||||
</permissions>
|
||||
</extraDirectories>
|
||||
</configuration>
|
||||
</plugin>
|
||||
</plugins>
|
||||
</build>
|
||||
|
||||
<profiles>
|
||||
<profile>
|
||||
<id>native</id>
|
||||
<build>
|
||||
<plugins>
|
||||
|
||||
<!-- skip jib in native builds — use Dockerfile.native + buildah instead -->
|
||||
<plugin>
|
||||
<groupId>com.google.cloud.tools</groupId>
|
||||
<artifactId>jib-maven-plugin</artifactId>
|
||||
<configuration>
|
||||
<skip>true</skip>
|
||||
</configuration>
|
||||
</plugin>
|
||||
|
||||
<!-- GraalVM native-image compilation -->
|
||||
<plugin>
|
||||
<groupId>org.graalvm.buildtools</groupId>
|
||||
<artifactId>native-maven-plugin</artifactId>
|
||||
<version>1.0.0</version>
|
||||
<executions>
|
||||
<execution>
|
||||
<id>add-reachability-metadata</id>
|
||||
<goals>
|
||||
<goal>add-reachability-metadata</goal>
|
||||
</goals>
|
||||
</execution>
|
||||
<execution>
|
||||
<id>compile</id>
|
||||
<goals>
|
||||
<goal>compile-no-fork</goal>
|
||||
</goals>
|
||||
<phase>package</phase>
|
||||
</execution>
|
||||
</executions>
|
||||
<configuration>
|
||||
<imageName>ai-teacher-backend</imageName>
|
||||
<buildArgs>
|
||||
<buildArg>--initialize-at-build-time=org.slf4j,ch.qos.logback</buildArg>
|
||||
<buildArg>-H:+ReportExceptionStackTraces</buildArg>
|
||||
<buildArg>--gc=serial</buildArg>
|
||||
<buildArg>-Os</buildArg>
|
||||
<buildArg>-H:+RemoveUnusedSymbols</buildArg>
|
||||
<buildArg>-H:-EnableLoggingFeature</buildArg>
|
||||
<buildArg>-R:MaxHeapSize=128m</buildArg>
|
||||
<buildArg>-R:MinHeapSize=32m</buildArg>
|
||||
<!-- Limit native-image compiler RAM (build time, not runtime) -->
|
||||
<buildArg>-J-Xmx8g</buildArg>
|
||||
</buildArgs>
|
||||
</configuration>
|
||||
</plugin>
|
||||
</plugins>
|
||||
</build>
|
||||
</profile>
|
||||
</profiles>
|
||||
|
||||
</project>
|
||||
|
||||
@@ -1,11 +1,15 @@
|
||||
package com.aiteacher;
|
||||
|
||||
import org.springframework.context.annotation.ImportRuntimeHints;
|
||||
import org.springframework.boot.SpringApplication;
|
||||
import org.springframework.boot.autoconfigure.SpringBootApplication;
|
||||
import org.springframework.scheduling.annotation.EnableAsync;
|
||||
|
||||
import com.aiteacher.config.NativeHintsConfig;
|
||||
|
||||
@SpringBootApplication
|
||||
@EnableAsync
|
||||
@ImportRuntimeHints(NativeHintsConfig.class)
|
||||
public class AiTeacherApplication {
|
||||
|
||||
public static void main(String[] args) {
|
||||
|
||||
@@ -0,0 +1,19 @@
|
||||
package com.aiteacher.auth;
|
||||
|
||||
import org.springframework.http.ResponseEntity;
|
||||
import org.springframework.web.bind.annotation.GetMapping;
|
||||
import org.springframework.web.bind.annotation.RequestMapping;
|
||||
import org.springframework.web.bind.annotation.RestController;
|
||||
|
||||
import java.security.Principal;
|
||||
import java.util.Map;
|
||||
|
||||
@RestController
|
||||
@RequestMapping("/api/v1/auth")
|
||||
public class AuthController {
|
||||
|
||||
@GetMapping("/check")
|
||||
public ResponseEntity<Map<String, String>> check(Principal principal) {
|
||||
return ResponseEntity.ok(Map.of("username", principal.getName()));
|
||||
}
|
||||
}
|
||||
@@ -1,6 +1,11 @@
|
||||
package com.aiteacher.book;
|
||||
|
||||
import com.aiteacher.document.FigureEntity;
|
||||
import com.aiteacher.document.FigureRepository;
|
||||
import com.aiteacher.document.MarkdownStorageService;
|
||||
import org.springframework.beans.factory.annotation.Value;
|
||||
import org.springframework.http.HttpStatus;
|
||||
import org.springframework.http.MediaType;
|
||||
import org.springframework.http.ResponseEntity;
|
||||
import org.springframework.web.bind.annotation.*;
|
||||
import org.springframework.web.multipart.MultipartFile;
|
||||
@@ -15,13 +20,25 @@ import java.util.UUID;
|
||||
public class BookController {
|
||||
|
||||
private final BookService bookService;
|
||||
private final FigureRepository figureRepository;
|
||||
private final MarkdownStorageService markdownStorageService;
|
||||
|
||||
public BookController(BookService bookService) {
|
||||
@Value("${app.features.upload-enabled:true}")
|
||||
private boolean uploadEnabled;
|
||||
|
||||
@Value("${app.features.delete-enabled:true}")
|
||||
private boolean deleteEnabled;
|
||||
|
||||
public BookController(BookService bookService, FigureRepository figureRepository,
|
||||
MarkdownStorageService markdownStorageService) {
|
||||
this.bookService = bookService;
|
||||
this.figureRepository = figureRepository;
|
||||
this.markdownStorageService = markdownStorageService;
|
||||
}
|
||||
|
||||
@PostMapping(consumes = "multipart/form-data")
|
||||
public ResponseEntity<?> upload(@RequestParam("file") MultipartFile file) throws IOException {
|
||||
if (!uploadEnabled) return ResponseEntity.status(HttpStatus.METHOD_NOT_ALLOWED).build();
|
||||
Book book = bookService.upload(file);
|
||||
return ResponseEntity.status(HttpStatus.ACCEPTED).body(toSummaryResponse(book));
|
||||
}
|
||||
@@ -42,10 +59,52 @@ public class BookController {
|
||||
|
||||
@DeleteMapping("/{id}")
|
||||
public ResponseEntity<Void> delete(@PathVariable UUID id) {
|
||||
if (!deleteEnabled) return ResponseEntity.status(HttpStatus.METHOD_NOT_ALLOWED).build();
|
||||
bookService.delete(id);
|
||||
return ResponseEntity.noContent().build();
|
||||
}
|
||||
|
||||
@PostMapping("/{id}/reembed")
|
||||
public ResponseEntity<Map<String, Object>> reembed(@PathVariable UUID id) {
|
||||
Book book = bookService.reembed(id);
|
||||
return ResponseEntity.accepted().body(Map.of(
|
||||
"bookId", book.getId(),
|
||||
"status", BookStatus.PROCESSING.name()
|
||||
));
|
||||
}
|
||||
|
||||
@GetMapping(value = "/{id}/pages/{pageNumber}/html", produces = MediaType.TEXT_HTML_VALUE)
|
||||
public ResponseEntity<String> getPageHtml(@PathVariable UUID id,
|
||||
@PathVariable int pageNumber) {
|
||||
bookService.getById(id); // 404 if not found
|
||||
try {
|
||||
return ResponseEntity.ok(markdownStorageService.getText(id, pageNumber));
|
||||
} catch (Exception e) {
|
||||
return ResponseEntity.notFound().build();
|
||||
}
|
||||
}
|
||||
|
||||
@GetMapping("/{id}/figures")
|
||||
public ResponseEntity<List<FigureResponse>> figures(@PathVariable UUID id) {
|
||||
bookService.getById(id); // 404 if not found
|
||||
List<FigureResponse> responses = figureRepository.findAllByBookId(id)
|
||||
.stream()
|
||||
.map(f -> toFigureResponse(id, f))
|
||||
.toList();
|
||||
return ResponseEntity.ok(responses);
|
||||
}
|
||||
|
||||
private FigureResponse toFigureResponse(UUID bookId, FigureEntity f) {
|
||||
String filename = f.getImagePath().substring(f.getImagePath().lastIndexOf('/') + 1);
|
||||
String imageUrl = "/api/v1/figures/" + bookId + "/" + filename;
|
||||
return new FigureResponse(
|
||||
f.getId(), f.getLabel(), f.getCaption(),
|
||||
f.getFigureType().name(), f.getPage(), imageUrl,
|
||||
f.getSectionId(),
|
||||
null // section title not eagerly loaded here
|
||||
);
|
||||
}
|
||||
|
||||
private Map<String, Object> toSummaryResponse(Book book) {
|
||||
return Map.of(
|
||||
"id", book.getId(),
|
||||
|
||||
@@ -1,41 +1,82 @@
|
||||
package com.aiteacher.book;
|
||||
|
||||
import com.aiteacher.document.*;
|
||||
import com.aiteacher.figure.FigureStorageService;
|
||||
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.ai.document.Document;
|
||||
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
|
||||
import org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig;
|
||||
import org.springframework.ai.vectorstore.VectorStore;
|
||||
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
|
||||
import org.springframework.core.io.FileSystemResource;
|
||||
import org.springframework.beans.factory.annotation.Value;
|
||||
import org.springframework.scheduling.annotation.Async;
|
||||
import org.springframework.stereotype.Service;
|
||||
import org.springframework.transaction.annotation.Transactional;
|
||||
|
||||
import java.nio.file.Path;
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
import java.util.regex.Pattern;
|
||||
import java.time.Instant;
|
||||
import java.util.*;
|
||||
|
||||
@Service
|
||||
public class BookEmbeddingService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(BookEmbeddingService.class);
|
||||
|
||||
// Pattern to detect diagram/figure captions
|
||||
private static final Pattern CAPTION_PATTERN =
|
||||
Pattern.compile("^(Figure|Fig\\.|Table|Diagram)\\s+[\\d.]+", Pattern.CASE_INSENSITIVE);
|
||||
|
||||
private final VectorStore vectorStore;
|
||||
private final BookRepository bookRepository;
|
||||
private final MarkerPageParser markerPageParser;
|
||||
private final FigureExtractionService figureExtractionService;
|
||||
private final VisionDescriptionService visionDescriptionService;
|
||||
private final TextChunkingService textChunkingService;
|
||||
private final ChunkFigureRefService chunkFigureRefService;
|
||||
private final SectionRepository sectionRepository;
|
||||
private final ChapterRepository chapterRepository;
|
||||
private final FigureRepository figureRepository;
|
||||
private final ChunkFigureRefRepository chunkFigureRefRepository;
|
||||
private final FigureStorageService figureStorageService;
|
||||
private final MarkdownStorageService markdownStorageService;
|
||||
|
||||
public BookEmbeddingService(VectorStore vectorStore, BookRepository bookRepository) {
|
||||
@Value("${app.embedding.batch-size:50}")
|
||||
private int embeddingBatchSize;
|
||||
|
||||
@Value("${app.embedding.batch-delay-ms:1000}")
|
||||
private long embeddingBatchDelayMs;
|
||||
|
||||
@Value("${app.embedding.skip-embedding:false}")
|
||||
private boolean skipEmbedding;
|
||||
|
||||
public BookEmbeddingService(
|
||||
VectorStore vectorStore,
|
||||
BookRepository bookRepository,
|
||||
MarkerPageParser markerPageParser,
|
||||
FigureExtractionService figureExtractionService,
|
||||
VisionDescriptionService visionDescriptionService,
|
||||
TextChunkingService textChunkingService,
|
||||
ChunkFigureRefService chunkFigureRefService,
|
||||
SectionRepository sectionRepository,
|
||||
ChapterRepository chapterRepository,
|
||||
FigureRepository figureRepository,
|
||||
ChunkFigureRefRepository chunkFigureRefRepository,
|
||||
FigureStorageService figureStorageService,
|
||||
MarkdownStorageService markdownStorageService) {
|
||||
this.vectorStore = vectorStore;
|
||||
this.bookRepository = bookRepository;
|
||||
this.markerPageParser = markerPageParser;
|
||||
this.figureExtractionService = figureExtractionService;
|
||||
this.visionDescriptionService = visionDescriptionService;
|
||||
this.textChunkingService = textChunkingService;
|
||||
this.chunkFigureRefService = chunkFigureRefService;
|
||||
this.sectionRepository = sectionRepository;
|
||||
this.chapterRepository = chapterRepository;
|
||||
this.figureRepository = figureRepository;
|
||||
this.chunkFigureRefRepository = chunkFigureRefRepository;
|
||||
this.figureStorageService = figureStorageService;
|
||||
this.markdownStorageService = markdownStorageService;
|
||||
}
|
||||
|
||||
@Async
|
||||
public void embedBook(UUID bookId, String bookTitle, Path pdfPath) {
|
||||
log.info("Starting embedding for book {} ({})", bookId, bookTitle);
|
||||
log.info("Starting Marker-powered embedding for book {} ({})", bookId, bookTitle);
|
||||
|
||||
Book book = bookRepository.findById(bookId).orElse(null);
|
||||
if (book == null) {
|
||||
@@ -47,29 +88,94 @@ public class BookEmbeddingService {
|
||||
book.setStatus(BookStatus.PROCESSING);
|
||||
bookRepository.save(book);
|
||||
|
||||
PagePdfDocumentReader reader = new PagePdfDocumentReader(
|
||||
new FileSystemResource(pdfPath.toFile()),
|
||||
PdfDocumentReaderConfig.builder()
|
||||
.withPagesPerDocument(1)
|
||||
.build()
|
||||
);
|
||||
String chapterId = bookId + "-ch1";
|
||||
ChapterEntity chapter = new ChapterEntity(chapterId, bookId, 1, bookTitle, 1);
|
||||
chapterRepository.save(chapter);
|
||||
|
||||
List<Document> pages = reader.get();
|
||||
int pageCount = pages.size();
|
||||
// Step 1: Parse with Marker — split into 100-page chunks, then merge results
|
||||
ParsedBook parsed = markerPageParser.parse(pdfPath);
|
||||
|
||||
// Enrich metadata and tag diagram captions
|
||||
List<Document> enriched = pages.stream()
|
||||
.map(doc -> enrichDocument(doc, bookId.toString(), bookTitle))
|
||||
List<PageResult> pageResults = parsed.pages();
|
||||
|
||||
// Step 2: Build SectionEntity per page and persist
|
||||
List<SectionEntity> sections = buildAndSaveSections(bookId, bookTitle, chapterId, pageResults);
|
||||
|
||||
// Step 3: Chunk and embed text
|
||||
List<Document> allChunks = new ArrayList<>();
|
||||
for (SectionEntity section : sections) {
|
||||
allChunks.addAll(textChunkingService.chunk(section, bookTitle));
|
||||
}
|
||||
if (skipEmbedding) {
|
||||
log.info("skip-embedding=true — skipping text embedding for book {}", bookId);
|
||||
} else {
|
||||
embedInBatches(allChunks, bookId);
|
||||
log.info("Embedded {} text chunks for book {}", allChunks.size(), bookId);
|
||||
}
|
||||
|
||||
// Step 4: Decode pre-cropped figures from Marker output
|
||||
FigureExtractionService.ExtractionResult extraction =
|
||||
figureExtractionService.extract(bookId, chapterId, pageResults);
|
||||
List<FigureEntity> figures = extraction.figures();
|
||||
|
||||
// Step 4b: Save per-page HTML to S3, replacing Marker image src with API URLs
|
||||
parsed.htmlByPage().forEach((pageNumber, html) -> {
|
||||
String resolved = resolveImageSrcs(html, bookId, extraction.blockIdToFigureId());
|
||||
markdownStorageService.save(bookId, pageNumber, resolved);
|
||||
});
|
||||
log.info("Saved {} HTML pages to S3 for book {}", parsed.htmlByPage().size(), bookId);
|
||||
|
||||
// Step 5: Vision analysis (description + visible text) → embed figure chunks
|
||||
Map<String, SectionEntity> sectionById = new HashMap<>();
|
||||
for (SectionEntity s : sections) sectionById.put(s.getId(), s);
|
||||
|
||||
for (FigureEntity figure : figures) {
|
||||
// Prefer caption extracted from the linked section's full text
|
||||
if (figure.getCaption() == null || figure.getCaption().isBlank()) {
|
||||
String sectionCaption = extractCaptionFromSection(sectionById.get(figure.getSectionId()));
|
||||
if (sectionCaption != null) {
|
||||
figure.setCaption(sectionCaption);
|
||||
figureRepository.save(figure);
|
||||
} else {
|
||||
byte[] imageBytes = figureStorageService.getBytes(figure.getImagePath());
|
||||
VisionDescriptionService.ImageAnalysis analysis =
|
||||
visionDescriptionService.analyze(imageBytes, figure.getCaption());
|
||||
figure.setCaption(analysis.description());
|
||||
figureRepository.save(figure);
|
||||
}
|
||||
}
|
||||
|
||||
// Embedding content: description
|
||||
String embeddingContent = (figure.getCaption() != null ? "\n" + figure.getCaption() : "");
|
||||
|
||||
String embeddingId = UUID.randomUUID().toString();
|
||||
if (!skipEmbedding) {
|
||||
Document figureDoc = new Document(embeddingId, embeddingContent,
|
||||
buildFigureMetadata(figure, bookTitle, embeddingId, ""));
|
||||
vectorStore.add(List.of(figureDoc));
|
||||
figure.setCaptionEmbeddingId(UUID.fromString(embeddingId));
|
||||
}
|
||||
figureRepository.save(figure);
|
||||
}
|
||||
log.info("Embedded {} figure chunks for book {}", figures.size(), bookId);
|
||||
|
||||
// Step 6: Link text chunks to figures via in-text references
|
||||
for (SectionEntity section : sections) {
|
||||
List<Document> sectionChunks = allChunks.stream()
|
||||
.filter(d -> section.getId().equals(d.getMetadata().get("section_id")))
|
||||
.toList();
|
||||
|
||||
vectorStore.add(enriched);
|
||||
List<FigureEntity> sectionFigures = figures.stream()
|
||||
.filter(f -> section.getId().equals(f.getSectionId()))
|
||||
.toList();
|
||||
chunkFigureRefService.linkChunksToFigures(sectionChunks, sectionFigures, section.getPageStart());
|
||||
}
|
||||
|
||||
book.setStatus(BookStatus.READY);
|
||||
book.setPageCount(pageCount);
|
||||
book.setProcessedAt(java.time.Instant.now());
|
||||
book.setPageCount(parsed.htmlByPage().size());
|
||||
book.setProcessedAt(Instant.now());
|
||||
bookRepository.save(book);
|
||||
|
||||
log.info("Finished embedding book {} — {} pages", bookId, pageCount);
|
||||
log.info("Finished embedding book {} — {} pages, {} figures",
|
||||
bookId, sections.size(), figures.size());
|
||||
|
||||
} catch (Exception ex) {
|
||||
log.error("Failed to embed book {}", bookId, ex);
|
||||
@@ -79,40 +185,112 @@ public class BookEmbeddingService {
|
||||
}
|
||||
}
|
||||
|
||||
private Document enrichDocument(Document doc, String bookId, String bookTitle) {
|
||||
String content = doc.getText();
|
||||
String chunkType = detectChunkType(content);
|
||||
|
||||
doc.getMetadata().put("book_id", bookId);
|
||||
doc.getMetadata().put("book_title", bookTitle);
|
||||
doc.getMetadata().put("chunk_type", chunkType);
|
||||
|
||||
return doc;
|
||||
}
|
||||
|
||||
private String detectChunkType(String content) {
|
||||
if (content != null) {
|
||||
for (String line : content.split("\\r?\\n")) {
|
||||
if (CAPTION_PATTERN.matcher(line.trim()).find()) {
|
||||
return "diagram";
|
||||
}
|
||||
}
|
||||
}
|
||||
return "text";
|
||||
}
|
||||
|
||||
@Transactional
|
||||
public void deleteBookChunks(UUID bookId) {
|
||||
log.info("Deleting vector chunks for book {}", bookId);
|
||||
log.info("Deleting all data for book {}", bookId);
|
||||
try {
|
||||
List<String> figureIds = figureRepository.findAllByBookId(bookId)
|
||||
.stream().map(FigureEntity::getId).toList();
|
||||
if (!figureIds.isEmpty()) {
|
||||
chunkFigureRefRepository.deleteByFigureIdIn(figureIds);
|
||||
}
|
||||
figureRepository.deleteAllByBookId(bookId);
|
||||
figureStorageService.deleteAll(bookId);
|
||||
markdownStorageService.deleteAll(bookId);
|
||||
sectionRepository.deleteAllByBookId(bookId);
|
||||
chapterRepository.deleteAllByBookId(bookId);
|
||||
|
||||
FilterExpressionBuilder b = new FilterExpressionBuilder();
|
||||
vectorStore.delete(b.eq("book_id", bookId.toString()).build());
|
||||
} catch (Exception ex) {
|
||||
log.warn("Could not delete vector chunks for book {}: {}", bookId, ex.getMessage());
|
||||
log.warn("Error during cleanup for book {}: {}", bookId, ex.getMessage());
|
||||
}
|
||||
}
|
||||
|
||||
private String truncate(String message, int maxLength) {
|
||||
if (message == null) return null;
|
||||
return message.length() <= maxLength ? message : message.substring(0, maxLength);
|
||||
// --- Private helpers ---
|
||||
|
||||
private List<SectionEntity> buildAndSaveSections(UUID bookId, String bookTitle,
|
||||
String chapterId,
|
||||
List<PageResult> pageResults) {
|
||||
List<SectionEntity> sections = new ArrayList<>();
|
||||
for (PageResult page : pageResults) {
|
||||
if (page.orderedText().isBlank()) continue;
|
||||
|
||||
String sectionId = bookId + "-p" + page.pageNumber();
|
||||
String title = truncate(page.headingTitle() != null ? page.headingTitle() : "Page " + page.pageNumber(), 500);
|
||||
|
||||
SectionEntity section = new SectionEntity(
|
||||
sectionId, chapterId, bookId,
|
||||
String.valueOf(page.pageNumber()),
|
||||
title,
|
||||
page.pageNumber(), page.pageNumber(),
|
||||
page.orderedText());
|
||||
sections.add(sectionRepository.save(section));
|
||||
}
|
||||
return sections;
|
||||
}
|
||||
|
||||
private void embedInBatches(List<Document> docs, UUID bookId) {
|
||||
int total = docs.size();
|
||||
for (int i = 0; i < total; i += embeddingBatchSize) {
|
||||
List<Document> batch = docs.subList(i, Math.min(i + embeddingBatchSize, total));
|
||||
vectorStore.add(batch);
|
||||
log.debug("Embedded batch {}/{} for book {}",
|
||||
i / embeddingBatchSize + 1, (total - 1) / embeddingBatchSize + 1, bookId);
|
||||
if (i + embeddingBatchSize < total) {
|
||||
try { Thread.sleep(embeddingBatchDelayMs); }
|
||||
catch (InterruptedException e) { Thread.currentThread().interrupt(); }
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private Map<String, Object> buildFigureMetadata(FigureEntity figure, String bookTitle,
|
||||
String embeddingId, String imageText) {
|
||||
Map<String, Object> m = new HashMap<>();
|
||||
m.put("type", "FIGURE");
|
||||
m.put("book_id", figure.getBookId().toString());
|
||||
m.put("book_title", bookTitle);
|
||||
m.put("chapter_id", figure.getChapterId() != null ? figure.getChapterId() : "");
|
||||
m.put("section_id", figure.getSectionId() != null ? figure.getSectionId() : "");
|
||||
m.put("figure_id", figure.getId());
|
||||
m.put("figure_type", figure.getFigureType().name());
|
||||
m.put("image_path", figure.getImagePath());
|
||||
m.put("label", figure.getLabel() != null ? figure.getLabel() : "");
|
||||
m.put("page", figure.getPage());
|
||||
m.put("embedding_id", embeddingId);
|
||||
m.put("image_text", imageText); // verbatim text visible inside the image
|
||||
return m;
|
||||
}
|
||||
|
||||
/**
|
||||
* Replaces Marker's {@code src='{blockId}'} image attributes with resolved API URLs.
|
||||
* Block IDs look like {@code /page/0/Figure/2}.
|
||||
*/
|
||||
private String resolveImageSrcs(String html, UUID bookId, Map<String, String> blockIdToFigureId) {
|
||||
for (Map.Entry<String, String> entry : blockIdToFigureId.entrySet()) {
|
||||
String blockId = entry.getKey();
|
||||
String figureId = entry.getValue();
|
||||
String apiUrl = "/api/v1/figures/" + bookId + "/" + figureId + ".png";
|
||||
// Marker emits both single and double-quoted src attributes
|
||||
html = html.replace("src='" + blockId + "'", "src='" + apiUrl + "'");
|
||||
html = html.replace("src=\"" + blockId + "\"", "src=\"" + apiUrl + "\"");
|
||||
}
|
||||
return html;
|
||||
}
|
||||
|
||||
private String extractCaptionFromSection(SectionEntity section) {
|
||||
if (section == null) return null;
|
||||
for (String line : section.getFullText().split("\n")) {
|
||||
String trimmed = line.strip();
|
||||
if (trimmed.startsWith("Fig.") || trimmed.startsWith("Figure") || trimmed.startsWith("Algorithm")) {
|
||||
return trimmed;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
private String truncate(String msg, int max) {
|
||||
if (msg == null) return null;
|
||||
return msg.length() <= max ? msg : msg.substring(0, max);
|
||||
}
|
||||
}
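The `src` rewriting done by `resolveImageSrcs` can be exercised on its own. The following is a minimal sketch, not the project's code: the class name `ImageSrcResolverSketch` is hypothetical, and plain `String` ids stand in for the `UUID` book id used above.

```java
import java.util.Map;

final class ImageSrcResolverSketch {

    // Rewrites src='<blockId>' and src="<blockId>" to an API URL of the
    // form /api/v1/figures/<bookId>/<figureId>.png (same shape as the diff).
    static String resolve(String html, String bookId, Map<String, String> blockIdToFigureId) {
        for (Map.Entry<String, String> entry : blockIdToFigureId.entrySet()) {
            String apiUrl = "/api/v1/figures/" + bookId + "/" + entry.getValue() + ".png";
            // The closing quote is part of the search string, so the block id
            // "/page/0/Figure/2" does not also match "/page/0/Figure/20".
            html = html.replace("src='" + entry.getKey() + "'", "src='" + apiUrl + "'");
            html = html.replace("src=\"" + entry.getKey() + "\"", "src=\"" + apiUrl + "\"");
        }
        return html;
    }
}
```

Calling `ImageSrcResolverSketch.resolve(html, "b1", Map.of("/page/0/Figure/2", "fig-a"))` rewrites only the attributes whose value is exactly that block id. Including the quotes in the match is what keeps ids that share a prefix apart.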
|
||||
|
||||
@@ -1,11 +1,13 @@
|
||||
package com.aiteacher.book;
|
||||
|
||||
import org.springframework.beans.factory.annotation.Value;
|
||||
import org.springframework.stereotype.Service;
|
||||
import org.springframework.web.multipart.MultipartFile;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.nio.file.Files;
|
||||
import java.nio.file.Path;
|
||||
import java.nio.file.Paths;
|
||||
import java.util.List;
|
||||
import java.util.NoSuchElementException;
|
||||
import java.util.UUID;
|
||||
@@ -15,10 +17,15 @@ public class BookService {
|
||||
|
||||
private final BookRepository bookRepository;
|
||||
private final BookEmbeddingService bookEmbeddingService;
|
||||
private final Path bookStoragePath;
|
||||
|
||||
public BookService(BookRepository bookRepository, BookEmbeddingService bookEmbeddingService) {
|
||||
public BookService(
|
||||
BookRepository bookRepository,
|
||||
BookEmbeddingService bookEmbeddingService,
|
||||
@Value("${app.figure-storage.base-path:./uploads}") String basePath) {
|
||||
this.bookRepository = bookRepository;
|
||||
this.bookEmbeddingService = bookEmbeddingService;
|
||||
this.bookStoragePath = Paths.get(basePath).toAbsolutePath().normalize().resolve("books");
|
||||
}
|
||||
|
||||
public Book upload(MultipartFile file) throws IOException {
|
||||
@@ -28,20 +35,35 @@ public class BookService {
|
||||
}
|
||||
|
||||
String title = deriveTitle(originalFilename);
|
||||
|
||||
Book book = new Book(title, originalFilename, file.getSize());
|
||||
book = bookRepository.save(book);
|
||||
|
||||
// Write to a temp file so the async task can read it
|
||||
Path tempFile = Files.createTempFile("aiteacher-", "-" + book.getId() + ".pdf");
|
||||
file.transferTo(tempFile.toFile());
|
||||
// Persist PDF in a stable location for potential re-embedding
|
||||
Files.createDirectories(bookStoragePath);
|
||||
Path pdfPath = bookStoragePath.resolve(book.getId() + ".pdf");
|
||||
file.transferTo(pdfPath.toFile());
|
||||
|
||||
UUID bookId = book.getId();
|
||||
Path pdfPath = tempFile;
|
||||
String bookTitle = title;
|
||||
bookEmbeddingService.embedBook(bookId, title, pdfPath);
|
||||
return book;
|
||||
}
|
||||
|
||||
bookEmbeddingService.embedBook(bookId, bookTitle, pdfPath);
|
||||
public Book reembed(UUID id) {
|
||||
Book book = bookRepository.findById(id)
|
||||
.orElseThrow(() -> new NoSuchElementException("Book not found."));
|
||||
|
||||
if (book.getStatus() == BookStatus.PROCESSING) {
|
||||
throw new IllegalStateException("Book is already being processed.");
|
||||
}
|
||||
|
||||
Path pdfPath = bookStoragePath.resolve(id + ".pdf");
|
||||
if (!Files.exists(pdfPath)) {
|
||||
throw new IllegalStateException(
|
||||
"Original PDF not found. Please re-upload the book before re-embedding.");
|
||||
}
|
||||
|
||||
bookEmbeddingService.deleteBookChunks(id);
|
||||
bookEmbeddingService.embedBook(id, book.getTitle(), pdfPath);
|
||||
return book;
|
||||
}
|
||||
|
||||
@@ -63,14 +85,21 @@ public class BookService {
|
||||
}
|
||||
|
||||
bookEmbeddingService.deleteBookChunks(id);
|
||||
|
||||
// Delete the stored PDF
|
||||
Path pdfPath = bookStoragePath.resolve(id + ".pdf");
|
||||
try {
|
||||
Files.deleteIfExists(pdfPath);
|
||||
} catch (IOException ex) {
|
||||
// Non-fatal — log only
|
||||
}
|
||||
|
||||
bookRepository.deleteById(id);
|
||||
}
|
||||
|
||||
private String deriveTitle(String filename) {
|
||||
// Strip .pdf extension and replace separators with spaces
|
||||
String name = filename.replaceAll("(?i)\\.pdf$", "");
|
||||
name = name.replaceAll("[-_]", " ");
|
||||
// Capitalise first letter
|
||||
if (!name.isEmpty()) {
|
||||
name = Character.toUpperCase(name.charAt(0)) + name.substring(1);
|
||||
}
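The hunk above cuts off before `deriveTitle` returns. The sketch below shows the same three steps (strip the `.pdf` suffix, replace separators, capitalise) in a standalone form. It assumes the method simply returns the transformed name; the class name `TitleSketch` is hypothetical.

```java
final class TitleSketch {

    static String deriveTitle(String filename) {
        // Strip a trailing ".pdf" in any letter case, then turn '-' and '_' into spaces.
        String name = filename.replaceAll("(?i)\\.pdf$", "").replaceAll("[-_]", " ");
        // Capitalise the first character if there is one.
        return name.isEmpty()
                ? name
                : Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }
}
```

For example, `TitleSketch.deriveTitle("spinal_anatomy-atlas.PDF")` yields `Spinal anatomy atlas`.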
|
||||
|
||||
@@ -0,0 +1,12 @@
|
||||
package com.aiteacher.book;

public record FigureResponse(
        String figureId,
        String label,
        String caption,
        String figureType,
        int page,
        String imageUrl,
        String sectionId,
        String sectionTitle
) {}
|
||||
@@ -3,28 +3,21 @@ package com.aiteacher.chat;
|
||||
import com.aiteacher.book.BookRepository;
|
||||
import com.aiteacher.book.BookStatus;
|
||||
import com.aiteacher.book.NoKnowledgeSourceException;
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import com.aiteacher.document.FigureEntity;
|
||||
import com.aiteacher.document.SectionEntity;
|
||||
import com.aiteacher.retrieval.CitationValidatorService;
|
||||
import com.aiteacher.retrieval.LabelledContext;
|
||||
import com.aiteacher.retrieval.NeurosurgeryRetriever;
|
||||
import com.aiteacher.retrieval.QueryExpansionService;
|
||||
import com.aiteacher.retrieval.RetrievalResult;
|
||||
import org.springframework.ai.chat.client.ChatClient;
|
||||
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
|
||||
import org.springframework.ai.chat.model.ChatResponse;
|
||||
import org.springframework.ai.document.Document;
|
||||
import org.springframework.ai.vectorstore.SearchRequest;
|
||||
import org.springframework.ai.vectorstore.VectorStore;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import java.util.ArrayList;
|
||||
import java.util.HashMap;
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
import java.util.NoSuchElementException;
|
||||
import java.util.UUID;
|
||||
import java.util.*;
|
||||
|
||||
@Service
|
||||
public class ChatService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(ChatService.class);
|
||||
|
||||
private static final String SYSTEM_PROMPT = """
|
||||
You are an expert neurosurgery educator assistant. Answer questions using the
|
||||
medical textbook content provided to you as context.
|
||||
@@ -35,26 +28,34 @@ public class ChatService {
|
||||
- Build answers from what is present: procedures, conditions, techniques, and descriptions all contribute; combine them into a rich, structured response
|
||||
- Use clear structure: headings, bullet points, or numbered steps where appropriate to maximize clarity
|
||||
- Only say you cannot answer if the context is entirely unrelated to the question
|
||||
- Cite sources for each major point (book title and page number from the context metadata)
|
||||
- Cite sources for each major claim using the reference labels from the context (e.g. [S1], [F2]). Prefer these labels over inventing page numbers, but you may also describe the source naturally if needed.
|
||||
- When referencing diagrams or figures, prefer their label from the context (e.g. [F1])
|
||||
- Maintain continuity with the conversation history
|
||||
- Never fabricate clinical information not present in the context
|
||||
""";
|
||||
|
||||
private final ChatClient chatClient;
|
||||
private final VectorStore vectorStore;
|
||||
private final BookRepository bookRepository;
|
||||
private final ChatSessionRepository sessionRepository;
|
||||
private final MessageRepository messageRepository;
|
||||
private final NeurosurgeryRetriever retriever;
|
||||
private final QueryExpansionService queryExpansionService;
|
||||
private final CitationValidatorService citationValidatorService;
|
||||
|
||||
public ChatService(ChatClient chatClient, VectorStore vectorStore,
|
||||
public ChatService(ChatClient chatClient,
|
||||
BookRepository bookRepository,
|
||||
ChatSessionRepository sessionRepository,
|
||||
MessageRepository messageRepository) {
|
||||
MessageRepository messageRepository,
|
||||
NeurosurgeryRetriever retriever,
|
||||
QueryExpansionService queryExpansionService,
|
||||
CitationValidatorService citationValidatorService) {
|
||||
this.chatClient = chatClient;
|
||||
this.vectorStore = vectorStore;
|
||||
this.bookRepository = bookRepository;
|
||||
this.sessionRepository = sessionRepository;
|
||||
this.messageRepository = messageRepository;
|
||||
this.retriever = retriever;
|
||||
this.queryExpansionService = queryExpansionService;
|
||||
this.citationValidatorService = citationValidatorService;
|
||||
}
|
||||
|
||||
public ChatSession createSession(String topicId) {
|
||||
@@ -73,7 +74,11 @@ public class ChatService {
|
||||
ChatSession session = sessionRepository.findById(sessionId)
|
||||
.orElseThrow(() -> new NoSuchElementException("Session not found."));
|
||||
|
||||
if (!bookRepository.existsByStatus(BookStatus.READY)) {
|
||||
List<com.aiteacher.book.Book> readyBooks = bookRepository.findAll().stream()
|
||||
.filter(b -> b.getStatus() == BookStatus.READY)
|
||||
.toList();
|
||||
|
||||
if (readyBooks.isEmpty()) {
|
||||
throw new NoKnowledgeSourceException("No books are available as knowledge sources.");
|
||||
}
|
||||
|
||||
@@ -81,27 +86,40 @@ public class ChatService {
|
||||
Message userMessage = new Message(sessionId, MessageRole.USER, userContent);
|
||||
messageRepository.save(userMessage);
|
||||
|
||||
// Build conversation history for context
|
||||
// Build full question with conversation history
|
||||
List<Message> history = messageRepository.findBySessionIdOrderByCreatedAtAsc(sessionId);
|
||||
|
||||
// Build the prompt with full conversation history as context
|
||||
String fullQuestion = buildQuestionWithHistory(history, userContent, session.getTopicId());
|
||||
|
||||
var qaAdvisor = QuestionAnswerAdvisor.builder(vectorStore)
|
||||
.searchRequest(SearchRequest.builder().similarityThreshold(0.5d).topK(6).build())
|
||||
.build();
|
||||
// Expand only the current user question to clinical terminology for retrieval (US1).
|
||||
// fullQuestion (which includes conversation history) is used for the LLM context prompt,
|
||||
// but retrieval should be driven by a concise clinical rewrite of the actual question.
|
||||
String retrievalQuery = queryExpansionService.expand(userContent).rewritten();
|
||||
|
||||
ChatResponse response = chatClient.prompt()
|
||||
.advisors(qaAdvisor)
|
||||
// Retrieve context from all ready books using the expanded query
|
||||
List<SectionEntity> allSections = new ArrayList<>();
|
||||
List<FigureEntity> allFigures = new ArrayList<>();
|
||||
for (com.aiteacher.book.Book book : readyBooks) {
|
||||
RetrievalResult result = retriever.retrieve(retrievalQuery, book.getId());
|
||||
allSections.addAll(result.parentSections());
|
||||
allFigures.addAll(result.figures());
|
||||
}
|
||||
|
||||
// Build labelled context prompt (US2): assigns [S1]/[F1] labels to each source
|
||||
LabelledContext ctx = buildContextPrompt(fullQuestion, allSections, allFigures);
|
||||
|
||||
// Generate answer
|
||||
String rawContent = chatClient.prompt()
|
||||
.system(SYSTEM_PROMPT)
|
||||
.user(fullQuestion)
|
||||
.user(ctx.promptText())
|
||||
.call()
|
||||
.chatResponse();
|
||||
.content();
|
||||
|
||||
String assistantContent = response.getResult().getOutput().getText();
|
||||
List<Map<String, Object>> sources = extractSources(response);
|
||||
// Strip any citation labels not present in the retrieved context (US2)
|
||||
String assistantContent = citationValidatorService.validate(rawContent, ctx.allLabels());
|
||||
|
||||
// Attach sources with their ref-labels for frontend traceability
|
||||
List<Map<String, Object>> sources = buildSources(allSections, allFigures);
|
||||
|
||||
// Persist assistant message
|
||||
Message assistantMessage = new Message(sessionId, MessageRole.ASSISTANT, assistantContent);
|
||||
assistantMessage.setSources(sources);
|
||||
return messageRepository.save(assistantMessage);
|
||||
@@ -118,24 +136,114 @@ public class ChatService {
|
||||
sessionRepository.deleteById(sessionId);
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Private helpers
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
/**
|
||||
* Builds the LLM context prompt, tagging each section as [S1], [S2]… and
|
||||
* each figure as [F1], [F2]… so the model can cite only known sources.
|
||||
*/
|
||||
private LabelledContext buildContextPrompt(String question,
|
||||
List<SectionEntity> sections,
|
||||
List<FigureEntity> figures) {
|
||||
Map<String, SectionEntity> sectionLabels = new LinkedHashMap<>();
|
||||
Map<String, FigureEntity> figureLabels = new LinkedHashMap<>();
|
||||
StringBuilder sb = new StringBuilder();
|
||||
|
||||
if (!sections.isEmpty()) {
|
||||
sb.append("CONTEXT:\n\n");
|
||||
for (int i = 0; i < sections.size(); i++) {
|
||||
SectionEntity section = sections.get(i);
|
||||
String label = "S" + (i + 1);
|
||||
sectionLabels.put(label, section);
|
||||
sb.append("[").append(label).append("] ")
|
||||
.append(section.getTitle())
|
||||
.append(", p.").append(section.getPageStart()).append("\n");
|
||||
sb.append(section.getFullText()).append("\n\n");
|
||||
}
|
||||
}
|
||||
|
||||
if (!figures.isEmpty()) {
|
||||
sb.append("AVAILABLE FIGURES:\n");
|
||||
for (int i = 0; i < figures.size(); i++) {
|
||||
FigureEntity figure = figures.get(i);
|
||||
String label = "F" + (i + 1);
|
||||
figureLabels.put(label, figure);
|
||||
sb.append("[").append(label).append("] ")
|
||||
.append(figure.getLabel() != null ? figure.getLabel() : "Figure")
|
||||
.append(" (p.").append(figure.getPage()).append("): ")
|
||||
.append(figure.getCaption() != null ? figure.getCaption() : "")
|
||||
.append("\n");
|
||||
}
|
||||
sb.append("\nWhen referencing diagrams, use their label from the context (e.g. [F1]).\n\n");
|
||||
}
|
||||
|
||||
sb.append("QUESTION:\n").append(question);
|
||||
return new LabelledContext(sectionLabels, figureLabels, sb.toString());
|
||||
}
|
||||
|
||||
private List<Map<String, Object>> buildSources(List<SectionEntity> sections,
|
||||
List<FigureEntity> figures) {
|
||||
List<Map<String, Object>> sources = new ArrayList<>();
|
||||
|
||||
for (int i = 0; i < sections.size(); i++) {
|
||||
SectionEntity section = sections.get(i);
|
||||
Map<String, Object> source = new LinkedHashMap<>();
|
||||
source.put("type", "TEXT");
|
||||
source.put("refLabel", "S" + (i + 1));
|
||||
source.put("bookId", section.getBookId());
|
||||
source.put("bookTitle", deriveTitleFromSection(section));
|
||||
source.put("page", section.getPageStart());
|
||||
source.put("chunkText", truncate(section.getFullText(), 500));
|
||||
sources.add(source);
|
||||
}
|
||||
|
||||
for (int i = 0; i < figures.size(); i++) {
|
||||
FigureEntity figure = figures.get(i);
|
||||
Map<String, Object> source = new LinkedHashMap<>();
|
||||
source.put("type", "FIGURE");
|
||||
source.put("refLabel", "F" + (i + 1));
|
||||
source.put("bookId", figure.getBookId());
|
||||
source.put("bookTitle", bookRepository.findById(figure.getBookId())
|
||||
.map(com.aiteacher.book.Book::getTitle).orElse("Book"));
|
||||
source.put("page", figure.getPage());
|
||||
source.put("figureId", figure.getId());
|
||||
source.put("label", figure.getLabel() != null ? figure.getLabel() : "");
|
||||
source.put("caption", figure.getCaption() != null ? figure.getCaption() : "");
|
||||
source.put("figureType", figure.getFigureType().name());
|
||||
String filename = figure.getImagePath().substring(
|
||||
figure.getImagePath().lastIndexOf('/') + 1);
|
||||
source.put("imageUrl", "/api/v1/figures/" + figure.getBookId() + "/" + filename);
|
||||
sources.add(source);
|
||||
}
|
||||
|
||||
return sources;
|
||||
}
|
||||
|
||||
private String deriveTitleFromSection(SectionEntity section) {
|
||||
if (section == null) return "Book";
|
||||
return bookRepository.findById(section.getBookId())
|
||||
.map(com.aiteacher.book.Book::getTitle)
|
||||
.orElse("Book");
|
||||
}
|
||||
|
||||
private String buildQuestionWithHistory(List<Message> history, String currentQuestion,
|
||||
String topicId) {
|
||||
boolean hasTopic = topicId != null && !topicId.equals("free-form");
|
||||
|
||||
if (history.size() <= 1) {
|
||||
return hasTopic
|
||||
? String.format("[Context: This is a question about the neurosurgery topic '%s']\n%s",
|
||||
? String.format("[Context: question about neurosurgery topic '%s']\n%s",
|
||||
topicId, currentQuestion)
|
||||
: currentQuestion;
|
||||
}
|
||||
|
||||
StringBuilder sb = new StringBuilder();
|
||||
if (hasTopic) {
|
||||
sb.append(String.format("[Context: This conversation is about the neurosurgery topic '%s']\n\n",
|
||||
topicId));
|
||||
sb.append(String.format("[Context: conversation about '%s']\n\n", topicId));
|
||||
}
|
||||
sb.append("Previous conversation:\n");
|
||||
// Include all messages except the last (which is the current user message just saved)
|
||||
for (int i = 0; i < history.size() - 1; i++) {
|
||||
Message msg = history.get(i);
|
||||
sb.append(msg.getRole().name()).append(": ").append(msg.getContent()).append("\n");
|
||||
@@ -144,30 +252,8 @@ public class ChatService {
|
||||
return sb.toString();
|
||||
}
|
||||
|
||||
private List<Map<String, Object>> extractSources(ChatResponse response) {
|
||||
List<Map<String, Object>> sources = new ArrayList<>();
|
||||
|
||||
if (response.getMetadata() != null) {
|
||||
Object retrieved = response.getMetadata().get(QuestionAnswerAdvisor.RETRIEVED_DOCUMENTS);
|
||||
if (retrieved instanceof List<?> docs) {
|
||||
for (Object docObj : docs) {
|
||||
if (docObj instanceof Document doc) {
|
||||
Map<String, Object> metadata = doc.getMetadata();
|
||||
String bookTitle = (String) metadata.get("book_title");
|
||||
Object pageObj = metadata.get("page_number");
|
||||
Integer page = pageObj instanceof Number n ? n.intValue() : null;
|
||||
if (bookTitle != null) {
|
||||
Map<String, Object> source = new HashMap<>();
|
||||
source.put("bookTitle", bookTitle);
|
||||
source.put("page", page);
|
||||
source.put("chunkText", doc.getText());
|
||||
sources.add(source);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return sources;
|
||||
private String truncate(String text, int maxChars) {
|
||||
if (text == null) return "";
|
||||
return text.length() <= maxChars ? text : text.substring(0, maxChars) + "…";
|
||||
}
|
||||
}
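`CitationValidatorService` is called above but its implementation is not part of this diff. As an illustration only, the sketch below shows one way to strip `[S…]`/`[F…]` labels that were never handed to the model. The class name, the regular expression, and the method signature are assumptions rather than the project's actual API.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CitationLabelSketch {

    // Matches labels such as [S1] or [F12]; group 1 is the label without brackets.
    private static final Pattern LABEL = Pattern.compile("\\[([SF]\\d+)\\]");

    // Keeps labels present in knownLabels and removes every other label.
    static String stripUnknownLabels(String answer, Set<String> knownLabels) {
        Matcher matcher = LABEL.matcher(answer);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String replacement = knownLabels.contains(matcher.group(1)) ? matcher.group() : "";
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With `Set.of("S1", "F2")` as the known labels, a stray `[S9]` is removed while `[S1]` and `[F2]` are kept. The surrounding whitespace is left untouched, so a real validator may also want to collapse the doubled spaces this can leave behind.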
|
||||
|
||||
@@ -0,0 +1,37 @@
|
||||
package com.aiteacher.config;

import com.aiteacher.figure.FigureStorageService;
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.server.ResponseStatusException;

import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;

/**
 * Serves figure images by redirecting to a presigned S3 URL.
 * The key stored in DB is the full S3 object key, e.g. "figures/{bookId}/{figureId}.png".
 */
@RestController
@RequestMapping("/api/v1/figures")
public class FigureStorageConfig {

    private final FigureStorageService figureStorageService;

    public FigureStorageConfig(FigureStorageService figureStorageService) {
        this.figureStorageService = figureStorageService;
    }

    @GetMapping("/{bookId}/{filename}")
    public void serve(@PathVariable String bookId,
                      @PathVariable String filename,
                      HttpServletResponse response) throws IOException {
        String key = "figures/" + bookId + "/" + filename;
        try {
            String url = figureStorageService.presignedUrl(key);
            response.sendRedirect(url);
        } catch (Exception ex) {
            throw new ResponseStatusException(HttpStatus.NOT_FOUND, "Figure not found: " + key);
        }
    }
}
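The controller above and `ChatService.buildSources` agree on a naming scheme: the stored image path ends in the filename, the public URL is `/api/v1/figures/{bookId}/{filename}`, and the object key is `figures/{bookId}/{filename}`. A small sketch of that round trip follows. The class and method names are hypothetical, and `String` ids are used for brevity.

```java
final class FigureUrlSketch {

    // Public URL for a stored image path such as "figures/<bookId>/<figureId>.png".
    static String imageUrl(String bookId, String imagePath) {
        // lastIndexOf returns -1 when there is no '/', so substring(0) keeps the whole path.
        String filename = imagePath.substring(imagePath.lastIndexOf('/') + 1);
        return "/api/v1/figures/" + bookId + "/" + filename;
    }

    // Object key that the controller rebuilds from the two path variables.
    static String objectKey(String bookId, String filename) {
        return "figures/" + bookId + "/" + filename;
    }
}
```

Under these assumptions, `imageUrl("b1", "figures/b1/f1.png")` gives `/api/v1/figures/b1/f1.png`, and `objectKey("b1", "f1.png")` maps the request back to `figures/b1/f1.png`.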
|
||||
@@ -0,0 +1,30 @@
|
||||
package com.aiteacher.config;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.web.client.RestClient;

import java.net.http.HttpClient;

@Configuration
public class MarkerConfig {

    @Value("${app.marker.base-url:http://localhost:8000}")
    private String markerBaseUrl;

    @Bean
    RestClient markerRestClient() {
        // Use the JDK HTTP client with no timeout — Marker conversions can take several minutes.
        HttpClient httpClient = HttpClient.newBuilder()
                .build();
        JdkClientHttpRequestFactory factory = new JdkClientHttpRequestFactory(httpClient);
        // No read timeout set: JDK HTTP client defaults to no deadline.

        return RestClient.builder()
                .baseUrl(markerBaseUrl)
                .requestFactory(factory)
                .build();
    }
}
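The configuration above deliberately sets no timeouts. If a bounded variant is ever wanted, the JDK client allows a short connect timeout to be combined with a generous per-request timeout. This is a sketch using only `java.net.http`; the class name and the durations chosen by the caller are assumptions, not values from this change.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

final class MarkerTimeoutSketch {

    // Bounds only the connection phase, so an unreachable host fails quickly.
    static HttpClient newClient(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    // Sets a per-request timeout; pick a ceiling well above the slowest expected conversion.
    static HttpRequest newRequest(URI uri, Duration requestTimeout) {
        return HttpRequest.newBuilder(uri)
                .timeout(requestTimeout)
                .GET()
                .build();
    }
}
```

A client built with `newClient(Duration.ofSeconds(5))` could replace the default `HttpClient` passed to `JdkClientHttpRequestFactory`. Whether a request-level ceiling suits multi-minute Marker conversions is a judgment call for the project.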
|
||||
@@ -0,0 +1,76 @@
|
||||
package com.aiteacher.config;

import org.springframework.aot.hint.MemberCategory;
import org.springframework.aot.hint.RuntimeHints;
import org.springframework.aot.hint.RuntimeHintsRegistrar;
import org.springframework.aot.hint.TypeReference;

/**
 * GraalVM native-image runtime hints for third-party libraries that use reflection
 * or classpath resource scanning not covered by Spring Boot's AOT processor.
 *
 * Registered via @ImportRuntimeHints on AiTeacherApplication.
 */
public class NativeHintsConfig implements RuntimeHintsRegistrar {

    @Override
    public void registerHints(RuntimeHints hints, ClassLoader classLoader) {
        // PDFBox — font and encoding resources loaded via classpath scanning at runtime
        hints.resources().registerPattern("org/apache/pdfbox/resources/*");
        hints.resources().registerPattern("org/apache/pdfbox/resources/afm/*");
        hints.resources().registerPattern("org/apache/pdfbox/resources/cmap/*");
        hints.resources().registerPattern("org/apache/pdfbox/resources/glyphlist/*");
        hints.resources().registerPattern("org/apache/pdfbox/resources/icc/*");
        hints.resources().registerPattern("org/apache/pdfbox/resources/ttf/*");
        hints.resources().registerPattern("org/apache/pdfbox/resources/version.properties");

        // PDFBox — font encoding classes instantiated via reflection
        hints.reflection().registerType(
                org.apache.pdfbox.pdmodel.font.encoding.GlyphList.class,
                MemberCategory.INVOKE_PUBLIC_CONSTRUCTORS,
                MemberCategory.INVOKE_PUBLIC_METHODS
        );
        hints.reflection().registerType(
                org.apache.pdfbox.pdmodel.font.encoding.WinAnsiEncoding.class,
                MemberCategory.INVOKE_PUBLIC_CONSTRUCTORS
        );
        hints.reflection().registerType(
                org.apache.pdfbox.pdmodel.font.encoding.MacRomanEncoding.class,
                MemberCategory.INVOKE_PUBLIC_CONSTRUCTORS
        );
        hints.reflection().registerType(
                org.apache.pdfbox.pdmodel.font.encoding.MacExpertEncoding.class,
                MemberCategory.INVOKE_PUBLIC_CONSTRUCTORS
        );
        hints.reflection().registerType(
                org.apache.pdfbox.pdmodel.font.encoding.StandardEncoding.class,
                MemberCategory.INVOKE_PUBLIC_CONSTRUCTORS
        );

        // JPA / Hibernate — array types used in entity mappings
        hints.reflection().registerType(java.util.UUID[].class, MemberCategory.INVOKE_PUBLIC_CONSTRUCTORS);

        // JBoss Logging — message logger implementations generated by annotation processor.
        // JBoss Logging uses reflection to look up the generated *_$logger class by name.
        registerJBossLogger(hints, "org.hibernate.jpa.internal.JpaLogger_$logger");
        registerJBossLogger(hints, "org.hibernate.internal.CoreMessageLogger_$logger");
        registerJBossLogger(hints, "org.hibernate.internal.EntityManagerMessageLogger_$logger");

        // AWS SDK v2 — HTTP client and SdkPojo serialization
        hints.resources().registerPattern("software/amazon/awssdk/global/handlers/execution.interceptors");
        hints.resources().registerPattern("software/amazon/awssdk/services/s3/execution.interceptors");
        hints.resources().registerPattern("codegen-resources/s3/*");
        hints.reflection().registerType(
                software.amazon.awssdk.services.s3.S3Client.class,
                MemberCategory.INVOKE_PUBLIC_METHODS
        );
    }

    private void registerJBossLogger(RuntimeHints hints, String className) {
        hints.reflection().registerType(
                TypeReference.of(className),
                MemberCategory.INVOKE_PUBLIC_CONSTRUCTORS,
                MemberCategory.INVOKE_PUBLIC_METHODS
        );
    }
}
|
||||
@@ -20,7 +20,9 @@ public class SecurityConfig {
     @Bean
     public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
         http
-            .authorizeHttpRequests(auth -> auth.anyRequest().authenticated())
+            .authorizeHttpRequests(auth -> auth
+                .requestMatchers("/api/v1/figures/**").permitAll()
+                .anyRequest().authenticated())
             .httpBasic(Customizer.withDefaults())
             .csrf(AbstractHttpConfigurer::disable);
         return http.build();
@@ -28,9 +30,10 @@ public class SecurityConfig {
 
     @Bean
     public UserDetailsService userDetailsService(
+            @Value("${app.auth.username}") String username,
             @Value("${app.auth.password}") String password) {
         UserDetails user = User.builder()
-            .username("neurosurgeon")
+            .username(username)
             .password("{noop}" + password)
             .roles("USER")
             .build();
|
||||
|
||||
@@ -0,0 +1,47 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import jakarta.persistence.*;
|
||||
import java.time.Instant;
|
||||
import java.util.UUID;
|
||||
|
||||
@Entity
|
||||
@Table(name = "chapter")
|
||||
public class ChapterEntity {
|
||||
|
||||
@Id
|
||||
@Column(name = "id", length = 200)
|
||||
private String id;
|
||||
|
||||
@Column(name = "book_id", nullable = false)
|
||||
private UUID bookId;
|
||||
|
||||
@Column(name = "number", nullable = false)
|
||||
private int number;
|
||||
|
||||
@Column(name = "title", length = 500)
|
||||
private String title;
|
||||
|
||||
@Column(name = "page_start")
|
||||
private Integer pageStart;
|
||||
|
||||
@Column(name = "created_at", nullable = false)
|
||||
private Instant createdAt;
|
||||
|
||||
public ChapterEntity() {}
|
||||
|
||||
public ChapterEntity(String id, UUID bookId, int number, String title, Integer pageStart) {
|
||||
this.id = id;
|
||||
this.bookId = bookId;
|
||||
this.number = number;
|
||||
this.title = title;
|
||||
this.pageStart = pageStart;
|
||||
this.createdAt = Instant.now();
|
||||
}
|
||||
|
||||
public String getId() { return id; }
|
||||
public UUID getBookId() { return bookId; }
|
||||
public int getNumber() { return number; }
|
||||
public String getTitle() { return title; }
|
||||
public Integer getPageStart() { return pageStart; }
|
||||
public Instant getCreatedAt() { return createdAt; }
|
||||
}
|
||||
@@ -0,0 +1,9 @@
package com.aiteacher.document;

import org.springframework.data.jpa.repository.JpaRepository;

import java.util.UUID;

public interface ChapterRepository extends JpaRepository<ChapterEntity, String> {
    void deleteAllByBookId(UUID bookId);
}
@@ -0,0 +1,58 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import jakarta.persistence.*;
|
||||
import java.io.Serializable;
|
||||
import java.util.Objects;
|
||||
import java.util.UUID;
|
||||
|
||||
@Entity
|
||||
@Table(name = "chunk_figure_ref")
|
||||
@IdClass(ChunkFigureRefEntity.PK.class)
|
||||
public class ChunkFigureRefEntity {
|
||||
|
||||
@Id
|
||||
@Column(name = "chunk_id", nullable = false)
|
||||
private UUID chunkId;
|
||||
|
||||
@Id
|
||||
@Column(name = "figure_id", nullable = false, length = 200)
|
||||
private String figureId;
|
||||
|
||||
@Column(name = "mention_page")
|
||||
private Integer mentionPage;
|
||||
|
||||
public ChunkFigureRefEntity() {}
|
||||
|
||||
public ChunkFigureRefEntity(UUID chunkId, String figureId, Integer mentionPage) {
|
||||
this.chunkId = chunkId;
|
||||
this.figureId = figureId;
|
||||
this.mentionPage = mentionPage;
|
||||
}
|
||||
|
||||
public UUID getChunkId() { return chunkId; }
|
||||
public String getFigureId() { return figureId; }
|
||||
public Integer getMentionPage() { return mentionPage; }
|
||||
|
||||
public static class PK implements Serializable {
|
||||
private UUID chunkId;
|
||||
private String figureId;
|
||||
|
||||
public PK() {}
|
||||
public PK(UUID chunkId, String figureId) {
|
||||
this.chunkId = chunkId;
|
||||
this.figureId = figureId;
|
||||
}
|
||||
|
||||
@Override
|
||||
public boolean equals(Object o) {
|
||||
if (this == o) return true;
|
||||
if (!(o instanceof PK pk)) return false;
|
||||
return Objects.equals(chunkId, pk.chunkId) && Objects.equals(figureId, pk.figureId);
|
||||
}
|
||||
|
||||
@Override
|
||||
public int hashCode() {
|
||||
return Objects.hash(chunkId, figureId);
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,18 @@
package com.aiteacher.document;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

import java.util.List;
import java.util.UUID;

public interface ChunkFigureRefRepository extends JpaRepository<ChunkFigureRefEntity, ChunkFigureRefEntity.PK> {

    @Query("SELECT r FROM ChunkFigureRefEntity r WHERE r.chunkId IN :chunkIds")
    List<ChunkFigureRefEntity> findByChunkIdIn(@Param("chunkIds") List<UUID> chunkIds);

    @Query("DELETE FROM ChunkFigureRefEntity r WHERE r.figureId IN :figureIds")
    @org.springframework.data.jpa.repository.Modifying
    void deleteByFigureIdIn(@Param("figureIds") List<String> figureIds);
}
@@ -0,0 +1,62 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.ai.document.Document;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
import java.util.regex.Matcher;
|
||||
import java.util.regex.Pattern;
|
||||
|
||||
/**
|
||||
* Scans chunk text for "Fig. X" and "Figure X" references and persists
|
||||
* ChunkFigureRefEntity rows linking that chunk to its referenced figures.
|
||||
*/
|
||||
@Service
|
||||
public class ChunkFigureRefService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(ChunkFigureRefService.class);
|
||||
|
||||
    // Matches: "Fig. 12-4", "Fig. 12.4", "Fig 12", "Figure 12-4", etc.
    // The number is digits joined by '-' or '.', so a sentence-ending period is not captured.
    private static final Pattern REF_PATTERN =
            Pattern.compile("(?i)\\b(Fig\\.?|Figure)\\s+(\\d+(?:[-.]\\d+)*)");
|
||||
|
||||
private final ChunkFigureRefRepository refRepository;
|
||||
|
||||
public ChunkFigureRefService(ChunkFigureRefRepository refRepository) {
|
||||
this.refRepository = refRepository;
|
||||
}
|
||||
|
||||
/**
|
||||
* For each text chunk, finds figure references and persists ChunkFigureRefEntity rows.
|
||||
*/
|
||||
public void linkChunksToFigures(List<Document> chunks, List<FigureEntity> bookFigures,
|
||||
int pageNum) {
|
||||
if (bookFigures.isEmpty()) return;
|
||||
|
||||
for (Document chunk : chunks) {
|
||||
String chunkIdStr = chunk.getId();
|
||||
UUID chunkId;
|
||||
try {
|
||||
chunkId = UUID.fromString(chunkIdStr);
|
||||
} catch (IllegalArgumentException ex) {
|
||||
log.warn("Chunk has non-UUID id: {}", chunkIdStr);
|
||||
continue;
|
||||
}
|
||||
|
||||
Matcher m = REF_PATTERN.matcher(chunk.getText());
|
||||
while (m.find()) {
|
||||
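                // Drop separators captured at a sentence end, e.g. "12." in "see Fig. 12."
                String refNum = m.group(2).replaceAll("[-.]+$", "");
                // Compare whole figure numbers: endsWith() let a reference to "Fig. 2" match "Fig. 12"
                for (FigureEntity figure : bookFigures) {
                    String label = figure.getLabel();
                    if (label == null) continue;
                    String labelNum = label.substring(label.lastIndexOf(' ') + 1).replaceAll("[-.]+$", "");
                    if (labelNum.equals(refNum)) {
                        refRepository.save(new ChunkFigureRefEntity(chunkId, figure.getId(), pageNum));
                        break;
                    }
                }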
|
||||
}
|
||||
}
|
||||
}
|
||||
}
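The reference scan in `ChunkFigureRefService` can be tried in isolation. The sketch below is a minimal, stdlib-only illustration and not part of this change: `FigureRefScanner` and `findRefs` are hypothetical names, and the number pattern is an assumption that stops at the last digit so a sentence-ending period is not captured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not in the diff): collects figure numbers referenced in a text. */
public class FigureRefScanner {

    // "Fig. 12-4", "Fig 12", "Figure 12.4"; the number is digits joined by '-' or '.'
    private static final Pattern REF =
            Pattern.compile("(?i)\\b(?:Fig\\.?|Figure)\\s+(\\d+(?:[-.]\\d+)*)");

    /** Returns the referenced figure numbers in order of first appearance, without duplicates. */
    public static List<String> findRefs(String text) {
        List<String> refs = new ArrayList<>();
        if (text == null) return refs;
        Matcher m = REF.matcher(text);
        while (m.find()) {
            String num = m.group(1);
            if (!refs.contains(num)) refs.add(num);
        }
        return refs;
    }

    public static void main(String[] args) {
        // prints [12-4, 3]
        System.out.println(findRefs("See Fig. 12-4 and Figure 3. Compare fig 12-4."));
    }
}
```

Deduplicating per chunk keeps a repeated mention from producing a second row for the same (chunk, figure) pair.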
|
||||
@@ -0,0 +1,82 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import jakarta.persistence.*;
|
||||
import java.time.Instant;
|
||||
import java.util.UUID;
|
||||
|
||||
@Entity
|
||||
@Table(name = "figure")
|
||||
public class FigureEntity {
|
||||
|
||||
@Id
|
||||
@Column(name = "id", length = 200)
|
||||
private String id;
|
||||
|
||||
@Column(name = "book_id", nullable = false)
|
||||
private UUID bookId;
|
||||
|
||||
@Column(name = "section_id", length = 200)
|
||||
private String sectionId;
|
||||
|
||||
@Column(name = "chapter_id", length = 200)
|
||||
private String chapterId;
|
||||
|
||||
@Column(name = "label", length = 100)
|
||||
private String label;
|
||||
|
||||
@Column(name = "caption", columnDefinition = "TEXT")
|
||||
private String caption;
|
||||
|
||||
@Enumerated(EnumType.STRING)
|
||||
@Column(name = "figure_type", nullable = false, length = 50)
|
||||
private FigureType figureType;
|
||||
|
||||
@Column(name = "page", nullable = false)
|
||||
private int page;
|
||||
|
||||
@Column(name = "image_path", nullable = false, length = 1000)
|
||||
private String imagePath;
|
||||
|
||||
@Column(name = "caption_embedding_id")
|
||||
private UUID captionEmbeddingId;
|
||||
|
||||
@Column(name = "created_at", nullable = false)
|
||||
private Instant createdAt;
|
||||
|
||||
public FigureEntity() {}
|
||||
|
||||
public FigureEntity(String id, UUID bookId, String sectionId, String chapterId,
|
||||
String label, String caption, FigureType figureType,
|
||||
int page, String imagePath) {
|
||||
this.id = id;
|
||||
this.bookId = bookId;
|
||||
this.sectionId = sectionId;
|
||||
this.chapterId = chapterId;
|
||||
this.label = label;
|
||||
this.caption = caption;
|
||||
this.figureType = figureType;
|
||||
this.page = page;
|
||||
this.imagePath = imagePath;
|
||||
this.createdAt = Instant.now();
|
||||
}
|
||||
|
||||
public String getId() { return id; }
|
||||
public UUID getBookId() { return bookId; }
|
||||
public String getSectionId() { return sectionId; }
|
||||
public String getChapterId() { return chapterId; }
|
||||
public String getLabel() { return label; }
|
||||
public String getCaption() { return caption; }
|
||||
public FigureType getFigureType() { return figureType; }
|
||||
public int getPage() { return page; }
|
||||
public String getImagePath() { return imagePath; }
|
||||
public UUID getCaptionEmbeddingId() { return captionEmbeddingId; }
|
||||
public Instant getCreatedAt() { return createdAt; }
|
||||
|
||||
public void setCaptionEmbeddingId(UUID captionEmbeddingId) {
|
||||
this.captionEmbeddingId = captionEmbeddingId;
|
||||
}
|
||||
|
||||
public void setCaption(String caption) {
|
||||
this.caption = caption;
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,151 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import com.aiteacher.figure.FigureStorageService;
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.beans.factory.annotation.Value;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import javax.imageio.ImageIO;
|
||||
import java.awt.image.BufferedImage;
|
||||
import java.io.ByteArrayInputStream;
|
||||
import java.io.IOException;
|
||||
import java.util.ArrayList;
|
||||
import java.util.HashMap;
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
import java.util.UUID;
|
||||
import java.util.regex.Matcher;
|
||||
import java.util.regex.Pattern;
|
||||
|
||||
/**
|
||||
* Extracts figure images from {@link PageResult.FigureData} entries produced by
|
||||
* {@link MarkerPageParser}.
|
||||
*
|
||||
* <p>Marker returns pre-cropped PNG bytes for each detected figure, so no PDFBox
|
||||
* page rendering or bounding-box cropping is needed. This service:
|
||||
* <ol>
|
||||
* <li>Decodes the PNG bytes to check dimensions (skip images below min size)</li>
|
||||
* <li>Classifies the figure type from caption and surrounding text keywords</li>
|
||||
* <li>Persists the image via {@link FigureStorageService}</li>
|
||||
* <li>Persists a {@link FigureEntity} to the database</li>
|
||||
* </ol>
|
||||
*/
|
||||
@Service
|
||||
public class FigureExtractionService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(FigureExtractionService.class);
|
||||
|
||||
    // Accepts "Fig. 12-4", "Fig 12" and "Figure 12.4"; trailing punctuation is not captured.
    private static final Pattern LABEL_PATTERN =
            Pattern.compile("(?i)\\bFig(?:ure|\\.)?\\s*(\\d+(?:[-.]\\d+)*)");
|
||||
|
||||
private final FigureStorageService storageService;
|
||||
private final FigureRepository figureRepository;
|
||||
private final int minImageSizePx;
|
||||
|
||||
public FigureExtractionService(
|
||||
FigureStorageService storageService,
|
||||
FigureRepository figureRepository,
|
||||
@Value("${app.figure-storage.min-image-size-px:100}") int minImageSizePx) {
|
||||
this.storageService = storageService;
|
||||
this.figureRepository = figureRepository;
|
||||
this.minImageSizePx = minImageSizePx;
|
||||
}
|
||||
|
||||
/** Holds the extraction output: persisted figures and a Marker blockId → DB figureId map. */
|
||||
public record ExtractionResult(List<FigureEntity> figures, Map<String, String> blockIdToFigureId) {}
|
||||
|
||||
/**
|
||||
* Extracts and persists figures for all pages described by {@code pageResults}.
|
||||
*
|
||||
* @param bookId owning book
|
||||
* @param chapterId chapter bucket for these sections
|
||||
* @param pageResults Marker parse output — each entry's {@code figures} list
|
||||
* carries pre-cropped PNG bytes for that page
|
||||
* @return {@link ExtractionResult} with persisted figures and blockId→figureId map
|
||||
* (used to resolve markdown image placeholders)
|
||||
*/
|
||||
public ExtractionResult extract(UUID bookId, String chapterId,
|
||||
List<PageResult> pageResults) {
|
||||
List<FigureEntity> figures = new ArrayList<>();
|
||||
Map<String, String> blockIdToFigureId = new HashMap<>();
|
||||
int figureCounter = 0;
|
||||
|
||||
for (PageResult page : pageResults) {
|
||||
if (page.figures().isEmpty()) continue;
|
||||
|
||||
for (PageResult.FigureData figureData : page.figures()) {
|
||||
try {
|
||||
BufferedImage image = decodeImage(figureData.imageBytes());
|
||||
if (image == null) {
|
||||
log.debug("Could not decode image on page {} of book {} (block {})",
|
||||
page.pageNumber(), bookId, figureData.blockId());
|
||||
continue;
|
||||
}
|
||||
if (image.getWidth() < minImageSizePx || image.getHeight() < minImageSizePx) {
|
||||
log.debug("Skipping small figure on page {} ({}×{})",
|
||||
page.pageNumber(), image.getWidth(), image.getHeight());
|
||||
continue;
|
||||
}
|
||||
|
||||
figureCounter++;
|
||||
String figureId = bookId + "-fig-" + page.pageNumber() + "-" + figureCounter;
|
||||
String caption = figureData.nearestCaption();
|
||||
String label = detectLabel(caption, figureCounter);
|
||||
FigureType type = classifyType(caption, page.orderedText());
|
||||
|
||||
String sectionId = bookId + "-p" + page.pageNumber();
|
||||
String imagePath = storageService.save(bookId, figureId, image);
|
||||
|
||||
FigureEntity figure = new FigureEntity(
|
||||
figureId, bookId, sectionId, chapterId,
|
||||
label, caption, type, page.pageNumber(), imagePath);
|
||||
figures.add(figureRepository.save(figure));
|
||||
blockIdToFigureId.put(figureData.blockId(), figureId);
|
||||
|
||||
} catch (Exception ex) {
|
||||
log.warn("Failed to extract figure on page {} of book {}: {}",
|
||||
page.pageNumber(), bookId, ex.getMessage());
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
log.info("Extracted {} figures for book {}", figures.size(), bookId);
|
||||
return new ExtractionResult(figures, blockIdToFigureId);
|
||||
}
|
||||
|
||||
// --- Private helpers ---
|
||||
|
||||
private BufferedImage decodeImage(byte[] imageBytes) {
|
||||
if (imageBytes == null || imageBytes.length == 0) return null;
|
||||
try {
|
||||
return ImageIO.read(new ByteArrayInputStream(imageBytes));
|
||||
} catch (IOException ex) {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
private String detectLabel(String caption, int counter) {
|
||||
if (caption != null) {
|
||||
Matcher m = LABEL_PATTERN.matcher(caption);
|
||||
if (m.find()) return "Fig. " + m.group(1).trim();
|
||||
}
|
||||
return "Fig. " + counter;
|
||||
}
|
||||
|
||||
    // Whole-word keyword patterns. Plain contains("ct ") also matched "effect " or "direct ",
    // and contains("graph") matched "photograph", so photographs were classified as CHART.
    private static final Pattern SCAN_WORDS =
            Pattern.compile("\\b(mri|ct|magnetic|tomography)\\b");
    private static final Pattern CHART_WORDS =
            Pattern.compile("\\b(chart|histogram|graph)s?\\b");
    private static final Pattern PHOTO_WORDS =
            Pattern.compile("\\b(photograph|photo)s?\\b");

    private FigureType classifyType(String caption, String pageText) {
        String combined = ((caption != null ? caption : "") + " " +
                (pageText != null ? pageText : "")).toLowerCase();
        if (SCAN_WORDS.matcher(combined).find()) return FigureType.MRI_CT_SCAN;
        if (combined.contains("intraoperative") || combined.contains("intra-op"))
            return FigureType.INTRAOPERATIVE_IMAGE;
        if (caption != null && caption.toLowerCase().startsWith("table"))
            return FigureType.TABLE;
        if (CHART_WORDS.matcher(combined).find()) return FigureType.CHART;
        if (PHOTO_WORDS.matcher(combined).find()) return FigureType.SURGICAL_PHOTOGRAPH;
        return FigureType.ANATOMICAL_DIAGRAM;
    }
|
||||
}
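The decode-then-filter step that `FigureExtractionService` applies before persisting a figure can be sketched on its own. This is an illustration under assumptions, not the service's code: `FigureSizeFilter`, `isLargeEnough` and `blankPng` are hypothetical names, and the 100 px threshold mirrors the default of `app.figure-storage.min-image-size-px`.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper (not in the diff): decodes image bytes and applies a minimum-size check. */
public class FigureSizeFilter {

    /** True when the bytes decode to an image at least minPx wide and minPx tall. */
    public static boolean isLargeEnough(byte[] imageBytes, int minPx) {
        if (imageBytes == null || imageBytes.length == 0) return false;
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(imageBytes));
            // ImageIO.read returns null when no registered reader understands the data
            return image != null && image.getWidth() >= minPx && image.getHeight() >= minPx;
        } catch (IOException ex) {
            return false;
        }
    }

    /** Demo helper: encodes a blank RGB image of the given size as PNG. */
    public static byte[] blankPng(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            ImageIO.write(image, "png", out);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(isLargeEnough(blankPng(200, 150), 100)); // true
        System.out.println(isLargeEnough(blankPng(200, 50), 100));  // false: height below 100
    }
}
```

Both dimensions must reach the threshold, which matches the service skipping a figure when either width or height is too small.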
|
||||
@@ -0,0 +1,11 @@
package com.aiteacher.document;

import org.springframework.data.jpa.repository.JpaRepository;

import java.util.List;
import java.util.UUID;

public interface FigureRepository extends JpaRepository<FigureEntity, String> {
    List<FigureEntity> findAllByBookId(UUID bookId);
    void deleteAllByBookId(UUID bookId);
}
@@ -0,0 +1,10 @@
package com.aiteacher.document;

public enum FigureType {
    ANATOMICAL_DIAGRAM,
    SURGICAL_PHOTOGRAPH,
    MRI_CT_SCAN,
    TABLE,
    CHART,
    INTRAOPERATIVE_IMAGE
}
@@ -0,0 +1,14 @@
package com.aiteacher.document;

import java.util.UUID;

public interface MarkdownStorageService {
    /** Uploads the markdown content and returns the S3 key. */
    String save(UUID bookId, int pageNumber, String markdown);

    /** Downloads and returns the markdown content for the given book and page. */
    String getText(UUID bookId, int pageNumber);

    /** Deletes all markdown files for the given book. */
    void deleteAll(UUID bookId);
}
@@ -0,0 +1,335 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import tools.jackson.databind.JsonNode;
|
||||
import tools.jackson.databind.ObjectMapper;
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.beans.factory.annotation.Qualifier;
|
||||
import org.springframework.core.io.FileSystemResource;
|
||||
import org.springframework.http.MediaType;
|
||||
import org.springframework.stereotype.Service;
|
||||
import org.springframework.util.LinkedMultiValueMap;
|
||||
import org.springframework.util.MultiValueMap;
|
||||
import org.springframework.web.client.RestClient;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.nio.file.Files;
|
||||
import java.nio.file.Path;
|
||||
import java.util.*;
|
||||
|
||||
|
||||
/**
|
||||
 * Parses a PDF by calling the Marker server with {@code output_format=json}, once per PDF chunk.
|
||||
*
|
||||
* <p>The JSON response contains an {@code output} field that is itself a JSON string with a
|
||||
* tree structure: the root has a {@code children} array where each item is a {@code Page} block.
|
||||
* Each block carries an {@code html} field with {@code <content-ref src='blockId'>} placeholders
|
||||
* that reference its {@code children} by ID.
|
||||
*
|
||||
* <p>{@link #jsonToHtml} mirrors the Marker Python {@code json_to_html} utility: it walks the
|
||||
* tree recursively and resolves every {@code content-ref} with the rendered HTML of the
|
||||
* referenced child block.
|
||||
*
|
||||
* <p>Returns a {@link ParsedBook} with:
|
||||
* <ul>
|
||||
* <li>{@code pages} — one {@link PageResult} per non-empty page (drives embeddings)</li>
|
||||
* <li>{@code htmlByPage} — full resolved HTML per page (saved to S3 for the reader)</li>
|
||||
* </ul>
|
||||
*/
|
||||
@Service
|
||||
public class MarkerPageParser {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(MarkerPageParser.class);
|
||||
|
||||
private static final Set<String> TEXT_BLOCK_TYPES = Set.of(
|
||||
"Text", "TextInlineMath", "ListItem", "Table", "TableOfContents", "Code", "Equation",
|
||||
"Footnote", "Caption", "PageHeader", "PageFooter", "Handwriting"
|
||||
);
|
||||
private static final Set<String> FIGURE_BLOCK_TYPES = Set.of("Figure", "Picture", "FigureGroup", "PictureGroup");
|
||||
|
||||
private static final int CHUNK_SIZE = 100;
|
||||
|
||||
private static final ObjectMapper MAPPER = new ObjectMapper();
|
||||
|
||||
private final RestClient restClient;
|
||||
private final PdfSplitterService pdfSplitterService;
|
||||
|
||||
public MarkerPageParser(@Qualifier("markerRestClient") RestClient restClient,
|
||||
PdfSplitterService pdfSplitterService) {
|
||||
this.restClient = restClient;
|
||||
this.pdfSplitterService = pdfSplitterService;
|
||||
}
|
||||
|
||||
/**
|
||||
* Parses a PDF by splitting it into {@value #CHUNK_SIZE}-page chunks, submitting each
|
||||
* chunk to Marker individually, and merging the results into a single {@link ParsedBook}.
|
||||
* Page numbers in the merged result are absolute (1-based across the whole document).
|
||||
*/
|
||||
public ParsedBook parse(Path pdfPath) throws IOException {
|
||||
List<PdfSplitterService.PdfChunk> chunks = pdfSplitterService.split(pdfPath, CHUNK_SIZE);
|
||||
log.info("Processing {} chunk(s) for {}", chunks.size(), pdfPath.getFileName());
|
||||
|
||||
List<PageResult> allPages = new ArrayList<>();
|
||||
Map<Integer, String> allHtml = new LinkedHashMap<>();
|
||||
|
||||
try {
|
||||
for (int c = 0; c < chunks.size(); c++) {
|
||||
PdfSplitterService.PdfChunk chunk = chunks.get(c);
|
||||
log.info("Submitting chunk {}/{} to Marker (page offset {})", c + 1, chunks.size(), chunk.pageOffset());
|
||||
|
||||
ParsedBook chunkResult = submitChunk(chunk.tempFile());
|
||||
|
||||
// Rebase page numbers from chunk-relative to document-absolute
|
||||
for (PageResult page : chunkResult.pages()) {
|
||||
int absolutePage = chunk.pageOffset() + page.pageNumber();
|
||||
allPages.add(new PageResult(absolutePage, page.orderedText(), page.headingTitle(), page.figures()));
|
||||
}
|
||||
chunkResult.htmlByPage().forEach((chunkPage, html) ->
|
||||
allHtml.put(chunk.pageOffset() + chunkPage, html));
|
||||
}
|
||||
} finally {
|
||||
// Delete temporary chunk files (skip if the chunk is the original PDF)
|
||||
for (PdfSplitterService.PdfChunk chunk : chunks) {
|
||||
if (!chunk.tempFile().equals(pdfPath)) {
|
||||
try { Files.deleteIfExists(chunk.tempFile()); }
|
||||
catch (IOException e) { log.warn("Could not delete temp chunk {}", chunk.tempFile()); }
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
log.info("Marker produced {} non-empty pages from {} chunk(s) of {}",
|
||||
allPages.size(), chunks.size(), pdfPath.getFileName());
|
||||
return new ParsedBook(allPages, allHtml);
|
||||
}
|
||||
|
||||
/** Submits a single PDF file to Marker and returns the parsed result with chunk-relative page numbers. */
|
||||
private ParsedBook submitChunk(Path chunkPath) {
|
||||
MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
|
||||
body.add("file", new FileSystemResource(chunkPath));
|
||||
body.add("output_format", "json");
|
||||
|
||||
JsonNode response = restClient.post()
|
||||
.uri("/marker/upload")
|
||||
.contentType(MediaType.MULTIPART_FORM_DATA)
|
||||
.body(body)
|
||||
.retrieve()
|
||||
.body(JsonNode.class);
|
||||
|
||||
        // Debug dump of the raw response; skipped when Marker returned no body
        if (response != null) {
            try {
                Files.writeString(Path.of("/tmp/marker-response-json.json"), response.toPrettyString());
            } catch (IOException e) {
                log.warn("Could not save Marker response to /tmp/marker-response-json.json", e);
            }
        }
|
||||
|
||||
List<JsonNode> pageNodes = extractPages(response);
|
||||
if (pageNodes.isEmpty()) {
|
||||
log.warn("Marker returned no pages for chunk {}", chunkPath.getFileName());
|
||||
return new ParsedBook(List.of(), Map.of());
|
||||
}
|
||||
|
||||
List<PageResult> pages = new ArrayList<>();
|
||||
Map<Integer, String> htmlByPage = new LinkedHashMap<>();
|
||||
|
||||
for (int i = 0; i < pageNodes.size(); i++) {
|
||||
JsonNode pageNode = pageNodes.get(i);
|
||||
int pageNumber = i + 1; // 1-based, chunk-relative
|
||||
|
||||
PageResult result = buildPageResult(pageNode, pageNumber);
|
||||
String html = jsonToHtml(pageNode);
|
||||
|
||||
// Always save HTML so the reader can navigate to every page
|
||||
htmlByPage.put(pageNumber, html);
|
||||
|
||||
// Only queue for embedding if the page has extractable content
|
||||
if (!result.orderedText().isBlank() || !result.figures().isEmpty()) {
|
||||
pages.add(result);
|
||||
}
|
||||
}
|
||||
|
||||
return new ParsedBook(pages, htmlByPage);
|
||||
}
|
||||
|
||||
// ── Page extraction ───────────────────────────────────────────────────────
|
||||
|
||||
/**
|
||||
* Parses the {@code output} JSON string and returns the list of page nodes
|
||||
* (the top-level {@code children} of the document root).
|
||||
*/
|
||||
private List<JsonNode> extractPages(JsonNode response) {
|
||||
if (response == null) return List.of();
|
||||
JsonNode outputNode = response.path("output");
|
||||
if (outputNode.isMissingNode()) {
|
||||
log.warn("Marker response has no 'output' field");
|
||||
return List.of();
|
||||
}
|
||||
try {
|
||||
JsonNode root = MAPPER.readTree(outputNode.stringValue());
|
||||
JsonNode children = root.path("children");
|
||||
if (children.isMissingNode() || !children.isArray()) {
|
||||
log.warn("Marker output root has no 'children' array");
|
||||
return List.of();
|
||||
}
|
||||
List<JsonNode> result = new ArrayList<>();
|
||||
children.forEach(result::add);
|
||||
return result;
|
||||
} catch (Exception e) {
|
||||
log.warn("Could not parse Marker 'output' string as JSON: {}", e.getMessage());
|
||||
return List.of();
|
||||
}
|
||||
}
|
||||
|
||||
// ── HTML rendering ────────────────────────────────────────────────────────
|
||||
|
||||
/**
|
||||
* Java equivalent of the Marker Python {@code json_to_html} utility.
|
||||
*
|
||||
* <p>Algorithm:
|
||||
* <ol>
|
||||
* <li>If the block has no children, return its {@code html} as-is (leaf node).</li>
|
||||
* <li>Otherwise recursively render each child, then replace every
|
||||
* {@code <content-ref src='childId'>} placeholder in the block's own {@code html}
|
||||
* with the rendered child HTML.</li>
|
||||
* </ol>
|
||||
*/
|
||||
String jsonToHtml(JsonNode block) {
|
||||
String html = str(block.path("html"));
|
||||
|
||||
// If the block carries image data, inject <img> data-URI tags.
|
||||
// Marker stores base64 image bytes in block.images keyed by block ID.
|
||||
// Picture/Figure leaf blocks have empty html, so this is the only way to
|
||||
// get the image into the rendered output.
|
||||
JsonNode images = block.path("images");
|
||||
if (!images.isMissingNode() && !images.isNull() && !images.isEmpty()) {
|
||||
StringBuilder imgTags = new StringBuilder();
|
||||
images.properties().forEach(entry -> {
|
||||
String base64 = str(entry.getValue());
|
||||
if (!base64.isEmpty()) {
|
||||
String mime = detectImageMime(base64);
|
||||
imgTags.append("<img src=\"data:").append(mime)
|
||||
.append(";base64,").append(base64).append("\">");
|
||||
}
|
||||
});
|
||||
if (!imgTags.isEmpty()) {
|
||||
html = html + imgTags;
|
||||
}
|
||||
}
|
||||
|
||||
JsonNode children = block.path("children");
|
||||
if (children.isMissingNode() || children.isNull() || !children.isArray() || children.isEmpty()) {
|
||||
return html; // leaf node
|
||||
}
|
||||
|
||||
// Build id → rendered-html map for all direct children
|
||||
Map<String, String> childHtml = new LinkedHashMap<>();
|
||||
for (JsonNode child : children) {
|
||||
String id = str(child.path("id"));
|
||||
childHtml.put(id, jsonToHtml(child));
|
||||
}
|
||||
|
||||
// Replace every <content-ref src='id'></content-ref> with the child's HTML
|
||||
for (Map.Entry<String, String> entry : childHtml.entrySet()) {
|
||||
String ref = "<content-ref src='" + entry.getKey() + "'></content-ref>";
|
||||
html = html.replace(ref, entry.getValue());
|
||||
}
|
||||
|
||||
return html;
|
||||
}
|
||||
|
||||
// ── PageResult (text + figures for embeddings) ────────────────────────────
|
||||
|
||||
private PageResult buildPageResult(JsonNode pageBlock, int pageNumber) {
|
||||
StringBuilder text = new StringBuilder();
|
||||
String[] headingTitle = {null};
|
||||
List<PageResult.FigureData> figures = new ArrayList<>();
|
||||
|
||||
walkBlock(pageBlock, text, headingTitle, figures);
|
||||
return new PageResult(pageNumber, text.toString().strip(), headingTitle[0], figures);
|
||||
}
|
||||
|
||||
/** Recursively walks the block tree, collecting text and figures in reading order. */
|
||||
private void walkBlock(JsonNode block, StringBuilder text, String[] headingTitle,
|
||||
List<PageResult.FigureData> figures) {
|
||||
String type = str(block.path("block_type"));
|
||||
|
||||
if ("SectionHeader".equals(type)) {
|
||||
String heading = stripHtml(str(block.path("html"))).strip();
|
||||
if (!heading.isEmpty() && headingTitle[0] == null) headingTitle[0] = heading;
|
||||
appendText(text, heading);
|
||||
|
||||
} else if (TEXT_BLOCK_TYPES.contains(type)) {
|
||||
appendText(text, stripHtml(str(block.path("html"))));
|
||||
|
||||
} else if (FIGURE_BLOCK_TYPES.contains(type)) {
|
||||
String caption = findCaption(block);
|
||||
extractFigures(block, caption, figures);
|
||||
}
|
||||
|
||||
// Recurse into children (content-ref ordering is implicit via tree order)
|
||||
JsonNode children = block.path("children");
|
||||
if (!children.isMissingNode() && !children.isNull() && children.isArray()) {
|
||||
for (JsonNode child : children) {
|
||||
walkBlock(child, text, headingTitle, figures);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/** Finds the first Caption child inside a figure block, if any. */
|
||||
private String findCaption(JsonNode figureBlock) {
|
||||
JsonNode children = figureBlock.path("children");
|
||||
if (children.isMissingNode() || !children.isArray()) return null;
|
||||
for (JsonNode child : children) {
|
||||
if ("Caption".equals(str(child.path("block_type")))) {
|
||||
String caption = stripHtml(str(child.path("html"))).strip();
|
||||
return caption.isEmpty() ? null : caption;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
private void extractFigures(JsonNode block, String caption, List<PageResult.FigureData> out) {
|
||||
JsonNode images = block.path("images");
|
||||
if (images.isMissingNode() || images.isEmpty()) return;
|
||||
|
||||
images.properties().forEach(entry -> {
|
||||
String blockId = entry.getKey();
|
||||
String base64 = str(entry.getValue());
|
||||
if (base64.isEmpty()) return;
|
||||
try {
|
||||
byte[] bytes = Base64.getDecoder().decode(base64);
|
||||
out.add(new PageResult.FigureData(bytes, caption, blockId));
|
||||
} catch (IllegalArgumentException ex) {
|
||||
log.warn("Could not decode base64 image for block {}: {}", blockId, ex.getMessage());
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
// ── Utilities ─────────────────────────────────────────────────────────────
|
||||
|
||||
private void appendText(StringBuilder sb, String text) {
|
||||
if (text == null) return;
|
||||
String stripped = text.strip();
|
||||
if (stripped.isEmpty()) return;
|
||||
if (sb.length() > 0) sb.append("\n\n");
|
||||
sb.append(stripped);
|
||||
}
|
||||
|
||||
private String stripHtml(String html) {
|
||||
if (html == null || html.isEmpty()) return "";
|
||||
return html.replaceAll("<[^>]*>", "").replaceAll("\\s{2,}", " ").strip();
|
||||
}
|
||||
|
||||
/** Detects MIME type from the first characters of a base64-encoded image. */
|
||||
private static String detectImageMime(String base64) {
|
||||
if (base64.startsWith("/9j/")) return "image/jpeg";
|
||||
if (base64.startsWith("iVBOR")) return "image/png";
|
||||
if (base64.startsWith("R0lGO")) return "image/gif";
|
||||
if (base64.startsWith("UklGR")) return "image/webp";
|
||||
return "image/png"; // safe fallback
|
||||
}
|
||||
|
||||
/** Null-safe string extraction from a JsonNode (Jackson 3: stringValue() returns null for non-strings). */
|
||||
private static String str(JsonNode node) {
|
||||
String v = node.stringValue();
|
||||
return v != null ? v : "";
|
||||
}
|
||||
}
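The `content-ref` resolution performed by `jsonToHtml` reduces to a small recursion. The sketch below is a simplified illustration, not the parser itself: `ContentRefSketch` and its `Block` record are hypothetical stand-ins for the Jackson `JsonNode` tree, the block IDs are made up, and image injection is left out.

```java
import java.util.List;

/** Hypothetical, simplified model of Marker's block tree (not in the diff). */
public class ContentRefSketch {

    /** Stand-in for a Marker JSON block: its id, its own html, and its child blocks. */
    public record Block(String id, String html, List<Block> children) {}

    /** Renders a block by substituting each child's rendered HTML for its placeholder. */
    public static String render(Block block) {
        String html = block.html();
        for (Block child : block.children()) {
            String ref = "<content-ref src='" + child.id() + "'></content-ref>";
            html = html.replace(ref, render(child)); // a leaf has no children and renders as its own html
        }
        return html;
    }

    public static void main(String[] args) {
        Block text = new Block("/page/0/Text/1", "<p>Hello</p>", List.of());
        Block page = new Block("/page/0/Page/0",
                "<div><content-ref src='/page/0/Text/1'></content-ref></div>", List.of(text));
        System.out.println(render(page)); // <div><p>Hello</p></div>
    }
}
```

A child whose placeholder does not appear in the parent's HTML is simply dropped from the output, which is also how the parser's replace loop behaves.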
|
||||
@@ -0,0 +1,25 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* Internal DTO produced by MarkerPageParser for one PDF page.
|
||||
* Decouples the Marker HTTP API from downstream services.
|
||||
*/
|
||||
public record PageResult(
|
||||
int pageNumber, // 1-based, derived from Marker page block index
|
||||
String orderedText, // full page text in correct reading order (blocks joined by \n\n)
|
||||
String headingTitle, // first SectionHeader block on page, or null
|
||||
List<FigureData> figures // extracted figure images (may be empty)
|
||||
) {
|
||||
|
||||
/**
|
||||
* A figure extracted from the page.
|
||||
* Image bytes are PNG data decoded from the Marker JSON {@code images} map.
|
||||
*/
|
||||
public record FigureData(
|
||||
byte[] imageBytes, // PNG image data (base64-decoded from Marker response)
|
||||
String nearestCaption, // text of the adjacent Caption block, or null
|
||||
String blockId // Marker block ID (e.g. "/page/0/Figure/2") for traceability
|
||||
) {}
|
||||
}
|
||||
@@ -0,0 +1,16 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
|
||||
/**
|
||||
* Result of a full Marker parse: structured page data (from JSON) plus
|
||||
* per-page HTML (concatenated block HTML, keyed by 1-based page number).
|
||||
*
|
||||
* @param pages one entry per non-empty page, derived from the chunks response
|
||||
* @param htmlByPage concatenated block HTML keyed by 1-based page number
|
||||
*/
|
||||
public record ParsedBook(
|
||||
List<PageResult> pages,
|
||||
Map<Integer, String> htmlByPage
|
||||
) {}
|
||||
@@ -0,0 +1,72 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import org.apache.pdfbox.io.RandomAccessReadBufferedFile;
|
||||
import org.apache.pdfbox.multipdf.Splitter;
|
||||
import org.apache.pdfbox.pdfparser.PDFParser;
|
||||
import org.apache.pdfbox.pdmodel.PDDocument;
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.nio.file.Files;
|
||||
import java.nio.file.Path;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* Splits a PDF file into fixed-size chunks using PDFBox.
|
||||
* Each chunk is saved as a temporary file so it can be submitted independently to Marker.
|
||||
*/
|
||||
@Service
|
||||
public class PdfSplitterService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(PdfSplitterService.class);
|
||||
|
||||
/**
|
||||
* A chunk of a split PDF.
|
||||
*
|
||||
* @param tempFile path to the temporary PDF file (caller must delete when done)
|
||||
* @param pageOffset 0-based index of the first page in this chunk within the original document
|
||||
*/
|
||||
public record PdfChunk(Path tempFile, int pageOffset) {}
|
||||
|
||||
/**
|
||||
* Splits {@code pdfPath} into chunks of at most {@code maxPagesPerChunk} pages.
|
||||
* Returns a single-element list pointing at the original file (not a temp copy) when the document fits in one chunk; callers must not delete that path.
|
||||
*
|
||||
* @param pdfPath source PDF
|
||||
* @param maxPagesPerChunk maximum pages per chunk
|
||||
* @return ordered list of chunks; caller is responsible for deleting {@code tempFile}s
|
||||
*/
|
||||
public List<PdfChunk> split(Path pdfPath, int maxPagesPerChunk) throws IOException {
|
||||
try (PDDocument doc = new PDFParser(new RandomAccessReadBufferedFile(pdfPath.toFile())).parse()) {
|
||||
int totalPages = doc.getNumberOfPages();
|
||||
log.info("PDF {} has {} pages, splitting into chunks of {}", pdfPath.getFileName(), totalPages, maxPagesPerChunk);
|
||||
|
||||
if (totalPages <= maxPagesPerChunk) {
|
||||
// No split needed — return the original file as a single virtual chunk
|
||||
return List.of(new PdfChunk(pdfPath, 0));
|
||||
}
|
||||
|
||||
Splitter splitter = new Splitter();
|
||||
splitter.setSplitAtPage(maxPagesPerChunk);
|
||||
List<PDDocument> parts = splitter.split(doc);
|
||||
|
||||
List<PdfChunk> chunks = new ArrayList<>(parts.size());
|
||||
int offset = 0;
|
||||
for (PDDocument part : parts) {
|
||||
try {
|
||||
Path tmp = Files.createTempFile("marker-chunk-", ".pdf");
|
||||
part.save(tmp.toFile());
|
||||
chunks.add(new PdfChunk(tmp, offset));
|
||||
log.debug("Created chunk at {} (page offset {})", tmp, offset);
|
||||
offset += part.getNumberOfPages();
|
||||
} finally {
|
||||
part.close();
|
||||
}
|
||||
}
|
||||
return chunks;
|
||||
}
|
||||
}
|
||||
}
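Reviewer note: the `pageOffset` bookkeeping in `split` can be checked without PDFBox. The sketch below assumes every chunk except possibly the last holds exactly `maxPagesPerChunk` pages; `ChunkOffsetSketch` and both method names are hypothetical and only illustrate the arithmetic.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkOffsetSketch {

    /** 0-based offset of the first page of each chunk, in document order. */
    public static List<Integer> chunkOffsets(int totalPages, int maxPagesPerChunk) {
        if (maxPagesPerChunk <= 0) {
            throw new IllegalArgumentException("maxPagesPerChunk must be positive");
        }
        List<Integer> offsets = new ArrayList<>();
        for (int offset = 0; offset < totalPages; offset += maxPagesPerChunk) {
            offsets.add(offset);
        }
        return offsets;
    }

    /** Maps a 0-based page index inside a chunk back to the 1-based page number of the original PDF. */
    public static int originalPageNumber(int pageOffset, int localIndex) {
        return pageOffset + localIndex + 1;
    }

    public static void main(String[] args) {
        System.out.println(chunkOffsets(25, 10));       // offsets of a 25-page book in 10-page chunks
        System.out.println(originalPageNumber(20, 4));  // last page of the third chunk
    }
}
```

Unlike the service above, this sketch returns an empty list for a zero-page document instead of a single chunk.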
|
||||
@@ -0,0 +1,114 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import org.apache.pdfbox.Loader;
|
||||
import org.apache.pdfbox.pdmodel.PDDocument;
|
||||
import org.apache.pdfbox.pdmodel.PDPage;
|
||||
import org.apache.pdfbox.pdmodel.common.PDRectangle;
|
||||
import org.apache.pdfbox.text.PDFTextStripperByArea;
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.stereotype.Service;
|
||||
import org.springframework.transaction.annotation.Transactional;
|
||||
|
||||
import java.awt.Rectangle;
|
||||
import java.io.IOException;
|
||||
import java.nio.file.Path;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
|
||||
/**
|
||||
* Parses a PDF into page-level SectionEntity records stored in Postgres.
|
||||
* Uses column-aware extraction via PDFTextStripperByArea: for two-column pages,
|
||||
* left column is extracted first then right, preserving correct reading order.
|
||||
* Text is also normalized (collapsed whitespace) before storage.
|
||||
*/
|
||||
@Service
|
||||
public class PdfStructureParser {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(PdfStructureParser.class);
|
||||
|
||||
// Right column is considered empty (single-column page) if it has < 20% of left column's content
|
||||
private static final double TWO_COLUMN_THRESHOLD = 0.2;
|
||||
|
||||
private final ChapterRepository chapterRepository;
|
||||
private final SectionRepository sectionRepository;
|
||||
|
||||
public PdfStructureParser(ChapterRepository chapterRepository,
|
||||
SectionRepository sectionRepository) {
|
||||
this.chapterRepository = chapterRepository;
|
||||
this.sectionRepository = sectionRepository;
|
||||
}
|
||||
|
||||
@Transactional
|
||||
public List<SectionEntity> parse(UUID bookId, String bookTitle, Path pdfPath) {
|
||||
log.info("Parsing PDF structure for book {}", bookId);
|
||||
|
||||
String chapterId = bookId + "-ch1";
|
||||
ChapterEntity chapter = new ChapterEntity(chapterId, bookId, 1, bookTitle, 1);
|
||||
chapterRepository.save(chapter);
|
||||
|
||||
List<SectionEntity> sections = new ArrayList<>();
|
||||
|
||||
try (PDDocument doc = Loader.loadPDF(pdfPath.toFile())) {
|
||||
List<PDPage> pages = new ArrayList<>();
|
||||
doc.getPages().forEach(pages::add);
|
||||
|
||||
for (int i = 0; i < pages.size(); i++) {
|
||||
int pageNum = i + 1;
|
||||
String text = normalizeWhitespace(extractPageText(pages.get(i)));
|
||||
if (text.isBlank()) continue;
|
||||
|
||||
String sectionId = bookId + "-p" + pageNum;
|
||||
SectionEntity section = new SectionEntity(
|
||||
sectionId, chapterId, bookId,
|
||||
String.valueOf(pageNum),
|
||||
"Page " + pageNum,
|
||||
pageNum, pageNum,
|
||||
text
|
||||
);
|
||||
sections.add(sectionRepository.save(section));
|
||||
}
|
||||
} catch (IOException e) {
|
||||
throw new RuntimeException("Failed to parse PDF for book " + bookId, e);
|
||||
}
|
||||
|
||||
log.info("Parsed {} sections for book {}", sections.size(), bookId);
|
||||
return sections;
|
||||
}
|
||||
|
||||
/**
|
||||
* Extracts text from a single page using column-aware region extraction.
|
||||
* Splits the page at the horizontal midpoint. If the right region has fewer
|
||||
* than 20% of the characters of the left region, treats the page as single-column.
|
||||
*/
|
||||
private String extractPageText(PDPage page) throws IOException {
|
||||
PDRectangle mediaBox = page.getMediaBox();
|
||||
int width = (int) mediaBox.getWidth();
|
||||
int height = (int) mediaBox.getHeight();
|
||||
int mid = width / 2;
|
||||
|
||||
PDFTextStripperByArea stripper = new PDFTextStripperByArea();
|
||||
stripper.setSortByPosition(true);
|
||||
stripper.addRegion("left", new Rectangle(0, 0, mid, height));
|
||||
stripper.addRegion("right", new Rectangle(mid, 0, width - mid, height));
|
||||
stripper.extractRegions(page);
|
||||
|
||||
String left = stripper.getTextForRegion("left").strip();
|
||||
String right = stripper.getTextForRegion("right").strip();
|
||||
|
||||
if (right.length() < left.length() * TWO_COLUMN_THRESHOLD) {
|
||||
// Single-column page — left holds all (or nearly all) content
|
||||
return left.isEmpty() ? right : left;
|
||||
}
|
||||
return left + "\n\n" + right;
|
||||
}
|
||||
|
||||
/** Collapses multi-space/tab runs and excessive blank lines. */
|
||||
private String normalizeWhitespace(String text) {
|
||||
return text
|
||||
.replaceAll("[ \t]{2,}", " ")
|
||||
.replaceAll("\n{3,}", "\n\n")
|
||||
.trim();
|
||||
}
|
||||
}
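Reviewer note: the column decision in `extractPageText` is a pure function of the two region strings, so it can be exercised in isolation. The sketch below mirrors that logic under the same 20% threshold; `ColumnMergeSketch` and `merge` are hypothetical names.

```java
public class ColumnMergeSketch {

    /**
     * Joins left and right region text in reading order, or returns only one side
     * when the right region is too short to count as a second column.
     */
    public static String merge(String left, String right, double threshold) {
        String l = left.strip();
        String r = right.strip();
        if (r.length() < l.length() * threshold) {
            // Treated as single-column: the short right-hand text is not appended
            return l.isEmpty() ? r : l;
        }
        return l + "\n\n" + r;
    }

    public static void main(String[] args) {
        System.out.println(merge("left column text", "right column text", 0.2));
        System.out.println(merge("a".repeat(100), "page 7", 0.2));
    }
}
```

Note that in the single-column branch whatever was found in the right region is dropped, which may matter for pages with a short margin note or a page number on the right.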
|
||||
@@ -0,0 +1,97 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.beans.factory.annotation.Value;
|
||||
import org.springframework.stereotype.Service;
|
||||
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
|
||||
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
|
||||
import software.amazon.awssdk.core.sync.RequestBody;
|
||||
import software.amazon.awssdk.regions.Region;
|
||||
import software.amazon.awssdk.services.s3.S3Client;
|
||||
import software.amazon.awssdk.services.s3.S3Configuration;
|
||||
import software.amazon.awssdk.services.s3.model.*;
|
||||
|
||||
import java.net.URI;
|
||||
import java.nio.charset.StandardCharsets;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
|
||||
@Service
|
||||
public class S3MarkdownStorageService implements MarkdownStorageService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(S3MarkdownStorageService.class);
|
||||
|
||||
private final S3Client s3;
|
||||
private final String bucket;
|
||||
|
||||
public S3MarkdownStorageService(
|
||||
@Value("${app.figure-storage.endpoint}") String endpoint,
|
||||
@Value("${app.figure-storage.region}") String region,
|
||||
@Value("${app.figure-storage.bucket}") String bucket,
|
||||
@Value("${app.figure-storage.access-key-id}") String accessKeyId,
|
||||
@Value("${app.figure-storage.secret-access-key}") String secretKey) {
|
||||
this.bucket = bucket;
|
||||
URI endpointUri = URI.create(endpoint);
|
||||
StaticCredentialsProvider credentials = StaticCredentialsProvider.create(
|
||||
AwsBasicCredentials.create(accessKeyId, secretKey));
|
||||
Region awsRegion = Region.of(region);
|
||||
S3Configuration s3Config = S3Configuration.builder().pathStyleAccessEnabled(true).build();
|
||||
|
||||
this.s3 = S3Client.builder()
|
||||
.endpointOverride(endpointUri)
|
||||
.region(awsRegion)
|
||||
.credentialsProvider(credentials)
|
||||
.serviceConfiguration(s3Config)
|
||||
.build();
|
||||
}
|
||||
|
||||
@Override
|
||||
public String save(UUID bookId, int pageNumber, String markdown) {
|
||||
String key = key(bookId, pageNumber);
|
||||
byte[] bytes = markdown.getBytes(StandardCharsets.UTF_8);
|
||||
s3.putObject(
|
||||
PutObjectRequest.builder().bucket(bucket).key(key)
|
||||
.contentType("text/html; charset=utf-8")
|
||||
.contentLength((long) bytes.length).build(),
|
||||
RequestBody.fromBytes(bytes));
|
||||
return key;
|
||||
}
|
||||
|
||||
@Override
|
||||
public String getText(UUID bookId, int pageNumber) {
|
||||
byte[] bytes = s3.getObjectAsBytes(
|
||||
GetObjectRequest.builder().bucket(bucket).key(key(bookId, pageNumber)).build()
|
||||
).asByteArray();
|
||||
return new String(bytes, StandardCharsets.UTF_8);
|
||||
}
|
||||
|
||||
@Override
|
||||
public void deleteAll(UUID bookId) {
|
||||
String prefix = "html/" + bookId + "/";
|
||||
try {
|
||||
List<ObjectIdentifier> toDelete = new ArrayList<>();
|
||||
s3.listObjectsV2Paginator(ListObjectsV2Request.builder()
|
||||
.bucket(bucket).prefix(prefix).build()).stream()
|
||||
.flatMap(page -> page.contents().stream())
|
||||
.map(S3Object::key)
|
||||
.map(k -> ObjectIdentifier.builder().key(k).build())
|
||||
.forEach(toDelete::add);
|
||||
|
||||
if (toDelete.isEmpty()) return;
|
||||
|
||||
s3.deleteObjects(DeleteObjectsRequest.builder()
|
||||
.bucket(bucket)
|
||||
.delete(Delete.builder().objects(toDelete).build())
|
||||
.build());
|
||||
log.info("Deleted {} markdown files from S3 for book {}", toDelete.size(), bookId);
|
||||
} catch (S3Exception ex) {
|
||||
log.warn("Could not fully delete markdown for book {} from S3: {}", bookId, ex.getMessage());
|
||||
}
|
||||
}
|
||||
|
||||
private static String key(UUID bookId, int pageNumber) {
|
||||
return "html/" + bookId + "/page-" + pageNumber + ".html";
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,63 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import jakarta.persistence.*;
|
||||
import java.time.Instant;
|
||||
import java.util.UUID;
|
||||
|
||||
@Entity
|
||||
@Table(name = "section")
|
||||
public class SectionEntity {
|
||||
|
||||
@Id
|
||||
@Column(name = "id", length = 200)
|
||||
private String id;
|
||||
|
||||
@Column(name = "chapter_id", nullable = false, length = 200)
|
||||
private String chapterId;
|
||||
|
||||
@Column(name = "book_id", nullable = false)
|
||||
private UUID bookId;
|
||||
|
||||
@Column(name = "number", length = 50)
|
||||
private String number;
|
||||
|
||||
@Column(name = "title", length = 500)
|
||||
private String title;
|
||||
|
||||
@Column(name = "page_start", nullable = false)
|
||||
private int pageStart;
|
||||
|
||||
@Column(name = "page_end", nullable = false)
|
||||
private int pageEnd;
|
||||
|
||||
@Column(name = "full_text", nullable = false, columnDefinition = "TEXT")
|
||||
private String fullText;
|
||||
|
||||
@Column(name = "created_at", nullable = false)
|
||||
private Instant createdAt;
|
||||
|
||||
public SectionEntity() {}
|
||||
|
||||
public SectionEntity(String id, String chapterId, UUID bookId, String number,
|
||||
String title, int pageStart, int pageEnd, String fullText) {
|
||||
this.id = id;
|
||||
this.chapterId = chapterId;
|
||||
this.bookId = bookId;
|
||||
this.number = number;
|
||||
this.title = title;
|
||||
this.pageStart = pageStart;
|
||||
this.pageEnd = pageEnd;
|
||||
this.fullText = fullText;
|
||||
this.createdAt = Instant.now();
|
||||
}
|
||||
|
||||
public String getId() { return id; }
|
||||
public String getChapterId() { return chapterId; }
|
||||
public UUID getBookId() { return bookId; }
|
||||
public String getNumber() { return number; }
|
||||
public String getTitle() { return title; }
|
||||
public int getPageStart() { return pageStart; }
|
||||
public int getPageEnd() { return pageEnd; }
|
||||
public String getFullText() { return fullText; }
|
||||
public Instant getCreatedAt() { return createdAt; }
|
||||
}
|
||||
@@ -0,0 +1,19 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import org.springframework.data.jpa.repository.JpaRepository;
|
||||
import org.springframework.data.jpa.repository.Query;
|
||||
import org.springframework.data.repository.query.Param;
|
||||
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
|
||||
public interface SectionRepository extends JpaRepository<SectionEntity, String> {
|
||||
List<SectionEntity> findAllByBookId(UUID bookId);
|
||||
void deleteAllByBookId(UUID bookId);
|
||||
|
||||
@Query("SELECT s FROM SectionEntity s WHERE s.bookId = :bookId AND s.pageStart <= :windowEnd AND s.pageEnd >= :windowStart ORDER BY s.pageStart")
|
||||
List<SectionEntity> findByBookIdAndPageOverlap(
|
||||
@Param("bookId") UUID bookId,
|
||||
@Param("windowStart") int windowStart,
|
||||
@Param("windowEnd") int windowEnd);
|
||||
}
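Reviewer note: the JPQL in `findByBookIdAndPageOverlap` is the standard closed-interval overlap test. A minimal in-memory equivalent, with the hypothetical name `PageOverlapSketch.overlaps`, makes the boundary behaviour easy to check:

```java
public class PageOverlapSketch {

    /** True when [pageStart, pageEnd] and [windowStart, windowEnd] share at least one page. */
    public static boolean overlaps(int pageStart, int pageEnd, int windowStart, int windowEnd) {
        return pageStart <= windowEnd && pageEnd >= windowStart;
    }

    public static void main(String[] args) {
        System.out.println(overlaps(5, 7, 7, 9)); // ranges touch on page 7
        System.out.println(overlaps(5, 7, 8, 9)); // ranges are adjacent but disjoint
    }
}
```

Both bounds are inclusive, so a section ending on the first page of the window is still returned.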
|
||||
@@ -0,0 +1,103 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import org.springframework.ai.document.Document;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import java.util.ArrayList;
|
||||
import java.util.HashMap;
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
import java.util.UUID;
|
||||
|
||||
/**
|
||||
* Splits a SectionEntity's full text into overlapping chunks for vector embedding.
|
||||
* Target size: ~1800 characters (~450 tokens); overlap: 200 characters.
|
||||
*/
|
||||
@Service
|
||||
public class TextChunkingService {
|
||||
|
||||
private static final int TARGET_CHARS = 1800;
|
||||
private static final int OVERLAP_CHARS = 200;
|
||||
|
||||
public List<Document> chunk(SectionEntity section, String bookTitle) {
|
||||
String text = section.getFullText();
|
||||
if (text == null || text.isBlank()) return List.of();
|
||||
|
||||
List<String> windows = split(text);
|
||||
List<Document> documents = new ArrayList<>();
|
||||
|
||||
for (int i = 0; i < windows.size(); i++) {
|
||||
String chunkId = UUID.randomUUID().toString();
|
||||
Map<String, Object> metadata = buildMetadata(section, bookTitle, i, windows.size(), chunkId);
|
||||
documents.add(new Document(chunkId, windows.get(i), metadata));
|
||||
}
|
||||
return documents;
|
||||
}
|
||||
|
||||
private List<String> split(String text) {
|
||||
List<String> windows = new ArrayList<>();
|
||||
int start = 0;
|
||||
while (start < text.length()) {
|
||||
int hardEnd = Math.min(start + TARGET_CHARS, text.length());
|
||||
if (hardEnd == text.length()) {
|
||||
String last = text.substring(start).strip();
|
||||
if (!last.isEmpty()) windows.add(last);
|
||||
break;
|
||||
}
|
||||
int splitAt = findSplitPoint(text, start, hardEnd);
|
||||
String chunk = text.substring(start, splitAt).strip();
|
||||
if (!chunk.isEmpty()) windows.add(chunk);
|
||||
// Overlap: back up from split point, align to a word start
|
||||
int overlapStart = Math.max(start + 1, splitAt - OVERLAP_CHARS);
|
||||
while (overlapStart < splitAt && text.charAt(overlapStart) != ' ') overlapStart++;
|
||||
start = overlapStart < splitAt ? overlapStart + 1 : splitAt;
|
||||
}
|
||||
return windows;
|
||||
}
|
||||
|
||||
/**
|
||||
* Finds the best split point at or before hardEnd, preferring (in order):
|
||||
* paragraph boundary, sentence boundary, word boundary, hard cut.
|
||||
*/
|
||||
private int findSplitPoint(String text, int start, int hardEnd) {
|
||||
int lookback = Math.min(400, (hardEnd - start) / 2);
|
||||
|
||||
// 1. Paragraph boundary
|
||||
int paraIdx = text.lastIndexOf("\n\n", hardEnd);
|
||||
if (paraIdx > hardEnd - lookback && paraIdx > start) return paraIdx + 2;
|
||||
|
||||
// 2. Sentence boundary (. ! ?) followed by space or newline
|
||||
for (int i = hardEnd - 1; i > hardEnd - lookback && i > start; i--) {
|
||||
char c = text.charAt(i);
|
||||
if ((c == '.' || c == '!' || c == '?') && i + 1 < text.length()) {
|
||||
char next = text.charAt(i + 1);
|
||||
if (next == ' ' || next == '\n') return i + 1;
|
||||
}
|
||||
}
|
||||
|
||||
// 3. Word boundary
|
||||
for (int i = hardEnd - 1; i > hardEnd - 100 && i > start; i--) {
|
||||
if (text.charAt(i) == ' ') return i + 1;
|
||||
}
|
||||
|
||||
// 4. Hard cut
|
||||
return hardEnd;
|
||||
}
|
||||
|
||||
private Map<String, Object> buildMetadata(SectionEntity section, String bookTitle,
|
||||
int index, int total, String chunkId) {
|
||||
Map<String, Object> m = new HashMap<>();
|
||||
m.put("type", "TEXT");
|
||||
m.put("book_id", section.getBookId().toString());
|
||||
m.put("book_title", bookTitle);
|
||||
m.put("chapter_id", section.getChapterId());
|
||||
m.put("section_id", section.getId());
|
||||
m.put("section_title", section.getTitle() != null ? section.getTitle() : "");
|
||||
m.put("page_start", section.getPageStart());
|
||||
m.put("page_end", section.getPageEnd());
|
||||
m.put("chunk_index", index);
|
||||
m.put("total_chunks", total);
|
||||
m.put("chunk_id", chunkId);
|
||||
return m;
|
||||
}
|
||||
}
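Reviewer note: stripped of the boundary search in `findSplitPoint`, the chunker is a sliding window that steps back by the overlap after each cut. The sketch below shows only that core idea with hard cuts; it is not the service's algorithm, and `OverlapWindowSketch` and `windows` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class OverlapWindowSketch {

    /** Splits text into windows of at most target chars, each sharing overlap chars with the previous one. */
    public static List<String> windows(String text, int target, int overlap) {
        if (target <= 0 || overlap < 0 || overlap >= target) {
            throw new IllegalArgumentException("need 0 <= overlap < target");
        }
        List<String> out = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + target, text.length());
            out.add(text.substring(start, end));
            if (end == text.length()) {
                break; // last window reached the end of the text
            }
            start = end - overlap; // step back so consecutive windows overlap
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(windows("abcdefghij", 4, 1));
    }
}
```

The `overlap < target` guard is what guarantees forward progress; the service above gets the same guarantee from `Math.max(start + 1, ...)`.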
|
||||
@@ -0,0 +1,108 @@
|
||||
package com.aiteacher.document;
|
||||
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.ai.chat.client.ChatClient;
|
||||
import org.springframework.beans.factory.annotation.Value;
|
||||
import org.springframework.core.io.ByteArrayResource;
|
||||
import org.springframework.stereotype.Service;
|
||||
import org.springframework.util.MimeTypeUtils;
|
||||
|
||||
/**
|
||||
* Analyses an extracted figure image using the OpenAI vision model.
|
||||
*
|
||||
* <p>Returns an {@link ImageAnalysis} record containing:
|
||||
* <ul>
|
||||
* <li>{@code description} — 2-3 sentence clinical description of the image</li>
|
||||
* <li>{@code imageText} — all visible text, labels, and annotations copied verbatim
|
||||
* from the image (empty string when none present)</li>
|
||||
* </ul>
|
||||
*
|
||||
* <p>Both fields are stored: {@code description} drives the embedding; {@code imageText}
|
||||
* is added to chunk metadata so queries can match exact labels (e.g., "Circle of Willis").
|
||||
*/
|
||||
@Service
|
||||
public class VisionDescriptionService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(VisionDescriptionService.class);
|
||||
|
||||
private static final String PROMPT = """
|
||||
You are a neurosurgery educator analysing a medical image.
|
||||
Respond in EXACTLY this format — no other text, no markdown:
|
||||
DESCRIPTION: <2-3 sentence clinical description focusing on anatomical structures, surgical landmarks, and clinical significance>
|
||||
IMAGE_TEXT: <all visible text, labels, measurements, and annotations copied verbatim, comma-separated; write NONE if no text visible>
|
||||
""";
|
||||
|
||||
/** Minimum ms between vision API calls. Configurable via app.vision.min-interval-ms. */
|
||||
private final long minIntervalMs;
|
||||
private final ChatClient chatClient;
|
||||
private volatile long lastCallAt = 0;
|
||||
|
||||
public VisionDescriptionService(
|
||||
ChatClient chatClient,
|
||||
@Value("${app.vision.min-interval-ms:2000}") long minIntervalMs) {
|
||||
this.chatClient = chatClient;
|
||||
this.minIntervalMs = minIntervalMs;
|
||||
}
|
||||
|
||||
/**
|
||||
* Holds the structured output of a vision model call on one figure image.
|
||||
*
|
||||
* @param description clinical description of the image content
|
||||
* @param imageText verbatim text visible inside the image; empty string if none
|
||||
*/
|
||||
public record ImageAnalysis(String description, String imageText) {}
|
||||
|
||||
/**
|
||||
* Analyses the image bytes and returns an {@link ImageAnalysis}.
|
||||
* Falls back gracefully: if the vision call fails, the caption is used as description
|
||||
* and imageText is left empty.
|
||||
*
|
||||
* @param imageBytes PNG bytes of the extracted figure
|
||||
* @param captionFallback caption detected from surrounding text, may be null
|
||||
*/
|
||||
public ImageAnalysis analyze(byte[] imageBytes, String captionFallback) {
|
||||
throttle();
|
||||
try {
|
||||
String raw = chatClient.prompt()
|
||||
.user(u -> u
|
||||
.text(PROMPT)
|
||||
.media(MimeTypeUtils.IMAGE_PNG, new ByteArrayResource(imageBytes)))
|
||||
.call()
|
||||
.content();
|
||||
return parse(raw, captionFallback);
|
||||
} catch (Exception ex) {
|
||||
log.warn("Vision analysis failed: {} — using caption as fallback", ex.getMessage());
|
||||
return new ImageAnalysis(
|
||||
captionFallback != null ? captionFallback : "Figure",
|
||||
"");
|
||||
}
|
||||
}
|
||||
|
||||
private synchronized void throttle() {
|
||||
long now = System.currentTimeMillis();
|
||||
long wait = minIntervalMs - (now - lastCallAt);
|
||||
if (wait > 0) {
|
||||
try { Thread.sleep(wait); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
|
||||
}
|
||||
lastCallAt = System.currentTimeMillis();
|
||||
}
|
||||
|
||||
private ImageAnalysis parse(String raw, String captionFallback) {
|
||||
String description = captionFallback != null ? captionFallback : "Figure";
|
||||
String imageText = "";
|
||||
|
||||
if (raw != null) {
|
||||
for (String line : raw.split("\n")) {
|
||||
if (line.startsWith("DESCRIPTION:")) {
|
||||
String val = line.substring("DESCRIPTION:".length()).strip();
|
||||
if (!val.isEmpty()) description = val;
|
||||
} else if (line.startsWith("IMAGE_TEXT:")) {
|
||||
String val = line.substring("IMAGE_TEXT:".length()).strip();
|
||||
if (!val.isEmpty() && !"NONE".equalsIgnoreCase(val)) imageText = val;
|
||||
}
|
||||
}
|
||||
}
|
||||
return new ImageAnalysis(description, imageText);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,27 @@
|
||||
package com.aiteacher.figure;
|
||||
|
||||
import java.awt.image.BufferedImage;
|
||||
import java.util.UUID;
|
||||
|
||||
public interface FigureStorageService {
|
||||
|
||||
/**
|
||||
* Saves an extracted image to S3 and returns the object key stored in the database.
|
||||
*/
|
||||
String save(UUID bookId, String figureId, BufferedImage image);
|
||||
|
||||
/**
|
||||
* Downloads the image bytes for the given S3 object key.
|
||||
*/
|
||||
byte[] getBytes(String key);
|
||||
|
||||
/**
|
||||
* Returns a presigned GET URL valid for 1 hour for the given S3 object key.
|
||||
*/
|
||||
String presignedUrl(String key);
|
||||
|
||||
/**
|
||||
* Deletes all figure objects for the given book.
|
||||
*/
|
||||
void deleteAll(UUID bookId);
|
||||
}
|
||||
@@ -0,0 +1,132 @@
|
||||
package com.aiteacher.figure;
|
||||
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.beans.factory.annotation.Value;
|
||||
import org.springframework.stereotype.Service;
|
||||
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
|
||||
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
|
||||
import software.amazon.awssdk.core.sync.RequestBody;
|
||||
import software.amazon.awssdk.regions.Region;
|
||||
import software.amazon.awssdk.services.s3.S3Client;
|
||||
import software.amazon.awssdk.services.s3.S3Configuration;
|
||||
import software.amazon.awssdk.services.s3.model.*;
|
||||
import software.amazon.awssdk.services.s3.presigner.S3Presigner;
|
||||
import software.amazon.awssdk.services.s3.presigner.model.GetObjectPresignRequest;
|
||||
import software.amazon.awssdk.services.s3.model.S3Object;
|
||||
|
||||
import javax.imageio.ImageIO;
|
||||
import java.awt.image.BufferedImage;
|
||||
import java.io.ByteArrayOutputStream;
|
||||
import java.io.IOException;
|
||||
import java.net.URI;
|
||||
import java.time.Duration;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
|
||||
@Service
|
||||
public class S3FigureStorageService implements FigureStorageService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(S3FigureStorageService.class);
|
||||
|
||||
private final S3Client s3;
|
||||
private final S3Presigner presigner;
|
||||
private final String bucket;
|
||||
|
||||
public S3FigureStorageService(
|
||||
@Value("${app.figure-storage.endpoint}") String endpoint,
|
||||
@Value("${app.figure-storage.region}") String region,
|
||||
@Value("${app.figure-storage.bucket}") String bucket,
|
||||
@Value("${app.figure-storage.access-key-id}") String accessKeyId,
|
||||
@Value("${app.figure-storage.secret-access-key}") String secretKey) {
|
||||
this.bucket = bucket;
|
||||
URI endpointUri = URI.create(endpoint);
|
||||
StaticCredentialsProvider credentials = StaticCredentialsProvider.create(
|
||||
AwsBasicCredentials.create(accessKeyId, secretKey));
|
||||
Region awsRegion = Region.of(region);
|
||||
|
||||
S3Configuration s3Config = S3Configuration.builder()
|
||||
.pathStyleAccessEnabled(true)
|
||||
.build();
|
||||
|
||||
this.s3 = S3Client.builder()
|
||||
.endpointOverride(endpointUri)
|
||||
.region(awsRegion)
|
||||
.credentialsProvider(credentials)
|
||||
.serviceConfiguration(s3Config)
|
||||
.build();
|
||||
|
||||
this.presigner = S3Presigner.builder()
|
||||
.endpointOverride(endpointUri)
|
||||
.region(awsRegion)
|
||||
.credentialsProvider(credentials)
|
||||
.serviceConfiguration(s3Config)
|
||||
.build();
|
||||
}
|
||||
|
||||
@Override
|
||||
public String save(UUID bookId, String figureId, BufferedImage image) {
|
||||
String key = "figures/" + bookId + "/" + figureId + ".png";
|
||||
try {
|
||||
ByteArrayOutputStream out = new ByteArrayOutputStream();
|
||||
ImageIO.write(image, "PNG", out);
|
||||
byte[] bytes = out.toByteArray();
|
||||
|
||||
s3.putObject(
|
||||
PutObjectRequest.builder().bucket(bucket).key(key)
|
||||
.contentType("image/png").contentLength((long) bytes.length).build(),
|
||||
RequestBody.fromBytes(bytes));
|
||||
return key;
|
||||
} catch (IOException ex) {
|
||||
throw new RuntimeException("Failed to encode figure " + figureId, ex);
|
||||
} catch (S3Exception ex) {
|
||||
throw new RuntimeException("Failed to upload figure " + figureId + " to S3", ex);
|
||||
}
|
||||
}
|
||||
|
||||
@Override
|
||||
public byte[] getBytes(String key) {
|
||||
try {
|
||||
return s3.getObjectAsBytes(
|
||||
GetObjectRequest.builder().bucket(bucket).key(key).build()).asByteArray();
|
||||
} catch (S3Exception ex) {
|
||||
throw new RuntimeException("Failed to download figure from S3: " + key, ex);
|
||||
}
|
||||
}
|
||||
|
||||
@Override
|
||||
public String presignedUrl(String key) {
|
||||
GetObjectPresignRequest request = GetObjectPresignRequest.builder()
|
||||
.signatureDuration(Duration.ofHours(1))
|
||||
.getObjectRequest(r -> r.bucket(bucket).key(key))
|
||||
.build();
|
||||
return presigner.presignGetObject(request).url().toString();
|
||||
}
|
||||
|
||||
@Override
|
||||
public void deleteAll(UUID bookId) {
|
||||
String prefix = "figures/" + bookId + "/";
|
||||
try {
|
||||
List<ObjectIdentifier> toDelete = new ArrayList<>();
|
||||
ListObjectsV2Request listRequest = ListObjectsV2Request.builder()
|
||||
.bucket(bucket).prefix(prefix).build();
|
||||
|
||||
s3.listObjectsV2Paginator(listRequest).stream()
|
||||
.flatMap(page -> page.contents().stream())
|
||||
.map(S3Object::key)
|
||||
.map(k -> ObjectIdentifier.builder().key(k).build())
|
||||
.forEach(toDelete::add);
|
||||
|
||||
if (toDelete.isEmpty()) return;
|
||||
|
||||
s3.deleteObjects(DeleteObjectsRequest.builder()
|
||||
.bucket(bucket)
|
||||
.delete(Delete.builder().objects(toDelete).build())
|
||||
.build());
|
||||
log.info("Deleted {} figures from S3 for book {}", toDelete.size(), bookId);
|
||||
} catch (S3Exception ex) {
|
||||
log.warn("Could not fully delete figures for book {} from S3: {}", bookId, ex.getMessage());
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,59 @@
|
||||
package com.aiteacher.retrieval;
|
||||
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.Set;
|
||||
import java.util.regex.Matcher;
|
||||
import java.util.regex.Pattern;
|
||||
|
||||
/**
|
||||
* Post-processes generated answers to strip citation labels that do not
|
||||
* correspond to any passage retrieved for the current query, preventing
|
||||
* hallucinated source references from reaching the user.
|
||||
*/
|
||||
@Service
|
||||
public class CitationValidatorService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(CitationValidatorService.class);
|
||||
|
||||
/** Matches citation labels of the form [S1], [F2], [S12], etc. */
|
||||
private static final Pattern CITATION_PATTERN = Pattern.compile("\\[(S|F)\\d+\\]");
|
||||
|
||||
/**
|
||||
* Removes any {@code [Sx]} / {@code [Fx]} citation in {@code generatedAnswer}
|
||||
* whose label is not contained in {@code validLabels}.
|
||||
*
|
||||
* @param generatedAnswer raw model output
|
||||
* @param validLabels set of labels present in the retrieved context
|
||||
* @return cleaned answer text with hallucinated citations removed
|
||||
*/
|
||||
public String validate(String generatedAnswer, Set<String> validLabels) {
|
||||
if (generatedAnswer == null) return "";
|
||||
|
||||
Matcher matcher = CITATION_PATTERN.matcher(generatedAnswer);
|
||||
List<String> removed = new ArrayList<>();
|
||||
StringBuffer sb = new StringBuffer();
|
||||
|
||||
while (matcher.find()) {
|
||||
String label = matcher.group();
|
||||
String inner = label.substring(1, label.length() - 1); // strip [ ]
|
||||
if (validLabels.contains(inner)) {
|
||||
matcher.appendReplacement(sb, Matcher.quoteReplacement(label));
|
||||
} else {
|
||||
removed.add(inner);
|
||||
matcher.appendReplacement(sb, "");
|
||||
}
|
||||
}
|
||||
matcher.appendTail(sb);
|
||||
|
||||
if (!removed.isEmpty()) {
|
||||
log.warn("Stripped hallucinated citations: {}", removed);
|
||||
}
|
||||
|
||||
return sb.toString();
|
||||
}
|
||||
}
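A minimal, self-contained sketch of the stripping behaviour above, for reviewers who want to try it without the Spring context. The class name `CitationStripSketch` and the sample sentence are illustrative assumptions; only the pattern and the keep-or-drop rule are taken from `CitationValidatorService`.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone class; mirrors the logic of CitationValidatorService.validate.
public class CitationStripSketch {

    // Same pattern as in the service: [S1], [F2], [S12], ...
    private static final Pattern CITATION = Pattern.compile("\\[(S|F)\\d+\\]");

    /** Removes every citation whose inner label (e.g. "S1") is not in validLabels. */
    public static String strip(String answer, Set<String> validLabels) {
        if (answer == null) return "";
        Matcher m = CITATION.matcher(answer);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String label = m.group();
            String inner = label.substring(1, label.length() - 1); // drop [ and ]
            String replacement = validLabels.contains(inner) ? label : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up answer text: [S1] is a known label, [F9] is not.
        String answer = "Clip the aneurysm neck [S1]. See the approach [F9].";
        System.out.println(strip(answer, Set.of("S1", "F2")));
    }
}
```

Note that only the label itself is removed: whitespace that surrounded a stripped citation stays in the text, so a caller may want to tidy spacing afterwards.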
|
||||
@@ -0,0 +1,7 @@
|
||||
package com.aiteacher.retrieval;

/**
 * Value object holding the original user query alongside its clinically
 * rewritten variant used for vector-store retrieval.
 */
public record ExpandedQuery(String original, String rewritten) {}
|
||||
@@ -0,0 +1,27 @@
|
||||
package com.aiteacher.retrieval;
|
||||
|
||||
import com.aiteacher.document.FigureEntity;
|
||||
import com.aiteacher.document.SectionEntity;
|
||||
|
||||
import java.util.HashSet;
|
||||
import java.util.Map;
|
||||
import java.util.Set;
|
||||
|
||||
/**
|
||||
* Value object produced when building the LLM context prompt.
|
||||
* Maps short ref-labels (S1, S2… / F1, F2…) to their source entities
|
||||
* and carries the fully formatted prompt text.
|
||||
*/
|
||||
public record LabelledContext(
|
||||
Map<String, SectionEntity> sectionLabels,
|
||||
Map<String, FigureEntity> figureLabels,
|
||||
String promptText) {
|
||||
|
||||
/** Returns the union of all valid citation labels for this context. */
|
||||
public Set<String> allLabels() {
|
||||
Set<String> labels = new HashSet<>();
|
||||
labels.addAll(sectionLabels.keySet());
|
||||
labels.addAll(figureLabels.keySet());
|
||||
return labels;
|
||||
}
|
||||
}
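The diff does not show where a `LabelledContext` is built, so the following is only a sketch of how the label maps might be populated and then unioned. `LabelSketch` and its `label` helper are hypothetical, and plain strings stand in for `SectionEntity` and `FigureEntity`.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; strings stand in for SectionEntity / FigureEntity.
public class LabelSketch {

    /** Assigns prefix1, prefix2, ... to the items in list order. */
    public static Map<String, String> label(String prefix, List<String> items) {
        Map<String, String> labelled = new LinkedHashMap<>();
        for (int i = 0; i < items.size(); i++) {
            labelled.put(prefix + (i + 1), items.get(i));
        }
        return labelled;
    }

    /** Union of section and figure labels, as LabelledContext.allLabels() does. */
    public static Set<String> allLabels(Map<String, String> sections,
                                        Map<String, String> figures) {
        Set<String> labels = new HashSet<>(sections.keySet());
        labels.addAll(figures.keySet());
        return labels;
    }

    public static void main(String[] args) {
        Map<String, String> sections = label("S", List.of("Pterional approach", "Clip selection"));
        Map<String, String> figures = label("F", List.of("Fig. 4.2"));
        System.out.println(allLabels(sections, figures)); // set order is unspecified
    }
}
```

The resulting set is the kind of value `CitationValidatorService.validate` expects as `validLabels`.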
|
||||
@@ -0,0 +1,111 @@
|
||||
package com.aiteacher.retrieval;
|
||||
|
||||
import com.aiteacher.document.*;
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.ai.document.Document;
|
||||
import org.springframework.ai.vectorstore.SearchRequest;
|
||||
import org.springframework.ai.vectorstore.VectorStore;
|
||||
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import java.util.*;
|
||||
|
||||
/**
|
||||
* Dual-modality retriever: searches text chunks and figure captions independently,
|
||||
* then expands text hits to their parent sections and merges linked figures.
|
||||
*/
|
||||
@Service
|
||||
public class NeurosurgeryRetriever {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(NeurosurgeryRetriever.class);
|
||||
|
||||
private static final int TEXT_TOP_K = 5;
|
||||
private static final int FIGURE_TOP_K = 3;
|
||||
|
||||
private final VectorStore vectorStore;
|
||||
private final SectionRepository sectionRepository;
|
||||
private final FigureRepository figureRepository;
|
||||
private final ChunkFigureRefRepository chunkFigureRefRepository;
|
||||
|
||||
public NeurosurgeryRetriever(VectorStore vectorStore,
|
||||
SectionRepository sectionRepository,
|
||||
FigureRepository figureRepository,
|
||||
ChunkFigureRefRepository chunkFigureRefRepository) {
|
||||
this.vectorStore = vectorStore;
|
||||
this.sectionRepository = sectionRepository;
|
||||
this.figureRepository = figureRepository;
|
||||
this.chunkFigureRefRepository = chunkFigureRefRepository;
|
||||
}
|
||||
|
||||
public RetrievalResult retrieve(String query, UUID bookId) {
|
||||
FilterExpressionBuilder b = new FilterExpressionBuilder();
|
||||
|
||||
// 1. Text chunk search
|
||||
List<Document> textHits = vectorStore.similaritySearch(
|
||||
SearchRequest.builder()
|
||||
.query(query)
|
||||
.topK(TEXT_TOP_K)
|
||||
.filterExpression(b.and(
|
||||
b.eq("type", "TEXT"),
|
||||
b.eq("book_id", bookId.toString())
|
||||
).build())
|
||||
.build()
|
||||
);
|
||||
|
||||
// 2. Figure caption search (independent topK)
|
||||
List<Document> figureHits = vectorStore.similaritySearch(
|
||||
SearchRequest.builder()
|
||||
.query(query)
|
||||
.topK(FIGURE_TOP_K)
|
||||
.filterExpression(b.and(
|
||||
b.eq("type", "FIGURE"),
|
||||
b.eq("book_id", bookId.toString())
|
||||
).build())
|
||||
.build()
|
||||
);
|
||||
|
||||
// 3. Expand text chunks to parent sections from Postgres
|
||||
List<String> sectionIds = textHits.stream()
|
||||
.map(d -> (String) d.getMetadata().get("section_id"))
|
||||
.filter(Objects::nonNull)
|
||||
.distinct()
|
||||
.toList();
|
||||
List<SectionEntity> sections = sectionIds.isEmpty()
|
||||
? List.of()
|
||||
: sectionRepository.findAllById(sectionIds);
|
||||
|
||||
// 4. Fetch figures explicitly linked to retrieved chunks
|
||||
List<UUID> chunkIds = textHits.stream()
|
||||
.map(d -> {
|
||||
try { return UUID.fromString(d.getId()); }
|
||||
catch (Exception e) { return null; }
|
||||
})
|
||||
.filter(Objects::nonNull)
|
||||
.toList();
|
||||
List<String> linkedFigureIds = chunkIds.isEmpty()
|
||||
? List.of()
|
||||
: chunkFigureRefRepository.findByChunkIdIn(chunkIds)
|
||||
.stream().map(ChunkFigureRefEntity::getFigureId).distinct().toList();
|
||||
List<FigureEntity> linkedFigures = linkedFigureIds.isEmpty()
|
||||
? List.of()
|
||||
: figureRepository.findAllById(linkedFigureIds);
|
||||
|
||||
// 5. Collect figures from semantic figure search
|
||||
List<String> semanticFigureIds = figureHits.stream()
|
||||
.map(d -> (String) d.getMetadata().get("figure_id"))
|
||||
.filter(Objects::nonNull)
|
||||
.toList();
|
||||
List<FigureEntity> semanticFigures = semanticFigureIds.isEmpty()
|
||||
? List.of()
|
||||
: figureRepository.findAllById(semanticFigureIds);
|
||||
|
||||
// 6. Merge and deduplicate figures by figureId (linked figures take precedence)
|
||||
Map<String, FigureEntity> merged = new LinkedHashMap<>();
|
||||
linkedFigures.forEach(f -> merged.put(f.getId(), f));
|
||||
semanticFigures.forEach(f -> merged.putIfAbsent(f.getId(), f));
|
||||
|
||||
log.debug("Retrieved {} sections, {} figures for query", sections.size(), merged.size());
|
||||
return new RetrievalResult(sections, new ArrayList<>(merged.values()));
|
||||
}
|
||||
}
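Step 6 of `retrieve` relies on `LinkedHashMap` insertion order plus `putIfAbsent` to give explicitly linked figures precedence over semantic hits. A small sketch of just that step, using a hypothetical `Figure` record in place of `FigureEntity`:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone class illustrating the merge step only.
public class FigureMergeSketch {

    /** Stand-in for FigureEntity: an id plus a note on where the hit came from. */
    public record Figure(String id, String origin) {}

    /** Linked figures go in first; semantic hits are added only if the id is new. */
    public static List<Figure> merge(List<Figure> linked, List<Figure> semantic) {
        Map<String, Figure> merged = new LinkedHashMap<>();
        linked.forEach(f -> merged.put(f.id(), f));
        semantic.forEach(f -> merged.putIfAbsent(f.id(), f));
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        List<Figure> linked = List.of(new Figure("f1", "linked"));
        List<Figure> semantic = List.of(new Figure("f1", "semantic"), new Figure("f2", "semantic"));
        // f1 keeps its "linked" copy; f2 is appended after it.
        System.out.println(merge(linked, semantic));
    }
}
```

Because the map preserves insertion order, linked figures also appear before semantic-only figures in the returned list.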
|
||||
@@ -0,0 +1,47 @@
|
||||
package com.aiteacher.retrieval;
|
||||
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.ai.chat.client.ChatClient;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
/**
|
||||
* Rewrites a user query into precise clinical/surgical terminology so that
|
||||
* vector-store retrieval can match textbook language even when the user's
|
||||
* phrasing differs from the documentation vocabulary.
|
||||
*/
|
||||
@Service
|
||||
public class QueryExpansionService {
|
||||
|
||||
private static final Logger log = LoggerFactory.getLogger(QueryExpansionService.class);
|
||||
|
||||
private static final String EXPANSION_PROMPT = """
|
||||
Rewrite the following question using precise medical and surgical terminology \
|
||||
as it would appear in a neurosurgery textbook index. \
|
||||
Output only the rewritten question, nothing else.
|
||||
Question: %s""";
|
||||
|
||||
private final ChatClient chatClient;
|
||||
|
||||
public QueryExpansionService(ChatClient chatClient) {
|
||||
this.chatClient = chatClient;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns an {@link ExpandedQuery} whose {@code rewritten} field contains
|
||||
* the clinically rephrased version of {@code query}.
|
||||
*/
|
||||
public ExpandedQuery expand(String query) {
|
||||
String rewritten = chatClient.prompt()
|
||||
.user(EXPANSION_PROMPT.formatted(query))
|
||||
.call()
|
||||
.content();
|
||||
|
||||
if (rewritten == null || rewritten.isBlank()) {
|
||||
rewritten = query;
|
||||
}
|
||||
|
||||
log.debug("Query expanded: '{}' → '{}'", query, rewritten);
|
||||
return new ExpandedQuery(query, rewritten);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,11 @@
|
||||
package com.aiteacher.retrieval;

import com.aiteacher.document.FigureEntity;
import com.aiteacher.document.SectionEntity;

import java.util.List;

public record RetrievalResult(
        List<SectionEntity> parentSections,
        List<FigureEntity> figures
) {}
|
||||
@@ -0,0 +1,7 @@
|
||||
package com.aiteacher.topic;

import java.time.Instant;
import java.util.UUID;

public record SavedSummaryItem(UUID id, int summaryNumber, Instant generatedAt) {
}
|
||||
@@ -5,6 +5,7 @@ import org.springframework.web.bind.annotation.*;
|
||||
|
||||
import java.util.List;
|
||||
import java.util.NoSuchElementException;
|
||||
import java.util.UUID;
|
||||
|
||||
@RestController
|
||||
@RequestMapping("/api/v1/topics")
|
||||
@@ -32,4 +33,21 @@ public class TopicController {
|
||||
TopicSummaryResponse response = topicSummaryService.generateSummary(topic);
|
||||
return ResponseEntity.ok(response);
|
||||
}
|
||||
|
||||
@GetMapping("/{id}/summaries")
|
||||
public ResponseEntity<List<SavedSummaryItem>> listSummaries(@PathVariable String id) {
|
||||
topicRepository.findById(id)
|
||||
.orElseThrow(() -> new NoSuchElementException("Topic not found."));
|
||||
|
||||
return ResponseEntity.ok(topicSummaryService.listSummaries(id));
|
||||
}
|
||||
|
||||
@GetMapping("/{id}/summaries/{summaryId}")
|
||||
public ResponseEntity<TopicSummaryResponse> getSummary(@PathVariable String id,
|
||||
@PathVariable UUID summaryId) {
|
||||
topicRepository.findById(id)
|
||||
.orElseThrow(() -> new NoSuchElementException("Topic not found."));
|
||||
|
||||
return ResponseEntity.ok(topicSummaryService.getSummary(summaryId));
|
||||
}
|
||||
}
|
||||
|
||||
@@ -0,0 +1,53 @@
|
||||
package com.aiteacher.topic;
|
||||
|
||||
import jakarta.persistence.Column;
|
||||
import jakarta.persistence.Entity;
|
||||
import jakarta.persistence.GeneratedValue;
|
||||
import jakarta.persistence.GenerationType;
|
||||
import jakarta.persistence.Id;
|
||||
import jakarta.persistence.Table;
|
||||
|
||||
import java.time.Instant;
|
||||
import java.util.UUID;
|
||||
|
||||
@Entity
|
||||
@Table(name = "topic_summary")
|
||||
public class TopicSummaryEntity {
|
||||
|
||||
@Id
|
||||
@GeneratedValue(strategy = GenerationType.UUID)
|
||||
private UUID id;
|
||||
|
||||
@Column(name = "topic_id", nullable = false)
|
||||
private String topicId;
|
||||
|
||||
@Column(name = "summary_number", nullable = false)
|
||||
private int summaryNumber;
|
||||
|
||||
@Column(nullable = false, columnDefinition = "TEXT")
|
||||
private String summary;
|
||||
|
||||
@Column(name = "sources_json", nullable = false, columnDefinition = "TEXT")
|
||||
private String sourcesJson;
|
||||
|
||||
@Column(name = "generated_at", nullable = false)
|
||||
private Instant generatedAt;
|
||||
|
||||
protected TopicSummaryEntity() {}
|
||||
|
||||
public TopicSummaryEntity(String topicId, int summaryNumber, String summary,
|
||||
String sourcesJson, Instant generatedAt) {
|
||||
this.topicId = topicId;
|
||||
this.summaryNumber = summaryNumber;
|
||||
this.summary = summary;
|
||||
this.sourcesJson = sourcesJson;
|
||||
this.generatedAt = generatedAt;
|
||||
}
|
||||
|
||||
public UUID getId() { return id; }
|
||||
public String getTopicId() { return topicId; }
|
||||
public int getSummaryNumber() { return summaryNumber; }
|
||||
public String getSummary() { return summary; }
|
||||
public String getSourcesJson() { return sourcesJson; }
|
||||
public Instant getGeneratedAt() { return generatedAt; }
|
||||
}
|
||||
@@ -0,0 +1,13 @@
|
||||
package com.aiteacher.topic;

import org.springframework.data.jpa.repository.JpaRepository;

import java.util.List;
import java.util.UUID;

public interface TopicSummaryRepository extends JpaRepository<TopicSummaryEntity, UUID> {

    List<TopicSummaryEntity> findByTopicIdOrderBySummaryNumberAsc(String topicId);

    long countByTopicId(String topicId);
}
|
||||
@@ -2,8 +2,11 @@ package com.aiteacher.topic;
|
||||
|
||||
import java.time.Instant;
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
|
||||
public record TopicSummaryResponse(
|
||||
UUID id,
|
||||
int summaryNumber,
|
||||
String topicId,
|
||||
String topicName,
|
||||
String summary,
|
||||
@@ -11,6 +14,7 @@ public record TopicSummaryResponse(
|
||||
Instant generatedAt
|
||||
) {
|
||||
public record SourceReference(
|
||||
String bookId,
|
||||
String bookTitle,
|
||||
Integer page
|
||||
) {
|
||||
|
||||
@@ -1,21 +1,25 @@
|
||||
package com.aiteacher.topic;
|
||||
|
||||
import com.aiteacher.book.Book;
|
||||
import com.aiteacher.book.BookRepository;
|
||||
import com.aiteacher.book.BookStatus;
|
||||
import com.aiteacher.book.NoKnowledgeSourceException;
|
||||
import com.aiteacher.document.FigureEntity;
|
||||
import com.aiteacher.document.SectionEntity;
|
||||
import com.aiteacher.retrieval.NeurosurgeryRetriever;
|
||||
import com.aiteacher.retrieval.RetrievalResult;
|
||||
import com.fasterxml.jackson.core.JsonProcessingException;
|
||||
import com.fasterxml.jackson.databind.ObjectMapper;
|
||||
import org.slf4j.Logger;
|
||||
import org.slf4j.LoggerFactory;
|
||||
import org.springframework.ai.chat.client.ChatClient;
|
||||
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
|
||||
import org.springframework.ai.chat.model.ChatResponse;
|
||||
import org.springframework.ai.document.Document;
|
||||
import org.springframework.ai.vectorstore.VectorStore;
|
||||
import org.springframework.stereotype.Service;
|
||||
|
||||
import java.time.Instant;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.Map;
|
||||
import java.util.NoSuchElementException;
|
||||
import java.util.UUID;
|
||||
|
||||
@Service
|
||||
public class TopicSummaryService {
|
||||
@@ -29,80 +33,190 @@ public class TopicSummaryService {
|
||||
|
||||
When answering:
|
||||
- Structure your response clearly with key points
|
||||
- If the context mentions specific book titles and page numbers, reference them
|
||||
- Cite claims using ONLY the reference labels provided in the context (e.g. [S1], [F2]).
|
||||
Do not invent page numbers, section titles, or labels not present in the CONTEXT block.
|
||||
- If the retrieved context does not contain sufficient information on the topic,
|
||||
explicitly state: "The uploaded books do not contain sufficient information on this topic."
|
||||
- Never hallucinate or fabricate clinical information
|
||||
""";
|
||||
|
||||
private final ChatClient chatClient;
|
||||
private final VectorStore vectorStore;
|
||||
private final BookRepository bookRepository;
|
||||
private final NeurosurgeryRetriever retriever;
|
||||
private final TopicSummaryRepository summaryRepository;
|
||||
private final ObjectMapper objectMapper;
|
||||
|
||||
public TopicSummaryService(ChatClient chatClient, VectorStore vectorStore,
|
||||
BookRepository bookRepository) {
|
||||
public TopicSummaryService(ChatClient chatClient,
|
||||
BookRepository bookRepository,
|
||||
NeurosurgeryRetriever retriever,
|
||||
TopicSummaryRepository summaryRepository,
|
||||
ObjectMapper objectMapper) {
|
||||
this.chatClient = chatClient;
|
||||
this.vectorStore = vectorStore;
|
||||
this.bookRepository = bookRepository;
|
||||
this.retriever = retriever;
|
||||
this.summaryRepository = summaryRepository;
|
||||
this.objectMapper = objectMapper;
|
||||
}
|
||||
|
||||
public TopicSummaryResponse generateSummary(Topic topic) {
|
||||
if (!bookRepository.existsByStatus(BookStatus.READY)) {
|
||||
List<Book> readyBooks = bookRepository.findAll().stream()
|
||||
.filter(b -> b.getStatus() == BookStatus.READY)
|
||||
.toList();
|
||||
|
||||
if (readyBooks.isEmpty()) {
|
||||
throw new NoKnowledgeSourceException(
|
||||
"No books are available as knowledge sources. Please upload and process at least one book.");
|
||||
}
|
||||
|
||||
String question = buildQuestion(topic);
|
||||
|
||||
ChatResponse response = chatClient.prompt()
|
||||
.system(SYSTEM_PROMPT)
|
||||
.advisors(QuestionAnswerAdvisor.builder(vectorStore).build())
|
||||
.user(question)
|
||||
.call()
|
||||
.chatResponse();
|
||||
List<SectionEntity> allSections = new ArrayList<>();
|
||||
List<FigureEntity> allFigures = new ArrayList<>();
|
||||
for (Book book : readyBooks) {
|
||||
RetrievalResult result = retriever.retrieve(question, book.getId());
|
||||
allSections.addAll(result.parentSections());
|
||||
allFigures.addAll(result.figures());
|
||||
}
|
||||
|
||||
String summary = response.getResult().getOutput().getText();
|
||||
List<TopicSummaryResponse.SourceReference> sources = extractSources(response);
|
||||
log.debug("Topic summary for '{}': {} sections, {} figures retrieved",
|
||||
topic.getName(), allSections.size(), allFigures.size());
|
||||
|
||||
String contextPrompt = buildContextPrompt(question, allSections, allFigures);
|
||||
String summary = chatClient.prompt()
|
||||
.system(SYSTEM_PROMPT)
|
||||
.user(contextPrompt)
|
||||
.call()
|
||||
.content();
|
||||
|
||||
List<TopicSummaryResponse.SourceReference> sources = buildSources(allSections, allFigures, readyBooks);
|
||||
Instant generatedAt = Instant.now();
|
||||
|
||||
int summaryNumber = (int) summaryRepository.countByTopicId(topic.getId()) + 1;
|
||||
String sourcesJson = serializeSources(sources);
|
||||
TopicSummaryEntity entity = new TopicSummaryEntity(
|
||||
topic.getId(), summaryNumber, summary, sourcesJson, generatedAt);
|
||||
entity = summaryRepository.save(entity);
|
||||
|
||||
return new TopicSummaryResponse(
|
||||
entity.getId(),
|
||||
summaryNumber,
|
||||
topic.getId(),
|
||||
topic.getName(),
|
||||
summary,
|
||||
sources,
|
||||
Instant.now()
|
||||
generatedAt
|
||||
);
|
||||
}
|
||||
|
||||
public List<SavedSummaryItem> listSummaries(String topicId) {
|
||||
return summaryRepository.findByTopicIdOrderBySummaryNumberAsc(topicId).stream()
|
||||
.map(e -> new SavedSummaryItem(e.getId(), e.getSummaryNumber(), e.getGeneratedAt()))
|
||||
.toList();
|
||||
}
|
||||
|
||||
public TopicSummaryResponse getSummary(UUID summaryId) {
|
||||
TopicSummaryEntity entity = summaryRepository.findById(summaryId)
|
||||
.orElseThrow(() -> new NoSuchElementException("Summary not found."));
|
||||
|
||||
List<TopicSummaryResponse.SourceReference> sources = deserializeSources(entity.getSourcesJson());
|
||||
|
||||
return new TopicSummaryResponse(
|
||||
entity.getId(),
|
||||
entity.getSummaryNumber(),
|
||||
entity.getTopicId(),
|
||||
entity.getTopicId(), // topic name is not persisted with the summary, so the topic ID fills the topicName slot
|
||||
entity.getSummary(),
|
||||
sources,
|
||||
entity.getGeneratedAt()
|
||||
);
|
||||
}
|
||||
|
||||
private String buildQuestion(Topic topic) {
|
||||
return String.format(
|
||||
"Please provide a comprehensive educational summary of the following neurosurgery topic: " +
|
||||
"Provide a comprehensive educational summary of the following neurosurgery topic: " +
|
||||
"%s. Topic description: %s. " +
|
||||
"Include key concepts, clinical considerations, and important details that a neurosurgeon should know.",
|
||||
topic.getName(), topic.getDescription()
|
||||
);
|
||||
}
|
||||
|
||||
private List<TopicSummaryResponse.SourceReference> extractSources(ChatResponse response) {
|
||||
private String buildContextPrompt(String question,
|
||||
List<SectionEntity> sections,
|
||||
List<FigureEntity> figures) {
|
||||
StringBuilder sb = new StringBuilder();
|
||||
|
||||
if (!sections.isEmpty()) {
|
||||
sb.append("CONTEXT:\n\n");
|
||||
for (int i = 0; i < sections.size(); i++) {
|
||||
SectionEntity s = sections.get(i);
|
||||
sb.append("[S").append(i + 1).append("] ")
|
||||
.append(s.getTitle()).append(", p.").append(s.getPageStart()).append("\n");
|
||||
sb.append(s.getFullText()).append("\n\n");
|
||||
}
|
||||
}
|
||||
|
||||
if (!figures.isEmpty()) {
|
||||
sb.append("AVAILABLE FIGURES:\n");
|
||||
for (int i = 0; i < figures.size(); i++) {
|
||||
FigureEntity f = figures.get(i);
|
||||
sb.append("[F").append(i + 1).append("] ")
|
||||
.append(f.getLabel() != null ? f.getLabel() : "Figure")
|
||||
.append(" (p.").append(f.getPage()).append("): ")
|
||||
.append(f.getCaption() != null ? f.getCaption() : "")
|
||||
.append("\n");
|
||||
}
|
||||
sb.append("\n");
|
||||
}
|
||||
|
||||
sb.append("QUESTION:\n").append(question);
|
||||
return sb.toString();
|
||||
}
|
||||
|
||||
private List<TopicSummaryResponse.SourceReference> buildSources(List<SectionEntity> sections,
|
||||
List<FigureEntity> figures,
|
||||
List<Book> readyBooks) {
|
||||
List<TopicSummaryResponse.SourceReference> sources = new ArrayList<>();
|
||||
|
||||
if (response.getMetadata() != null) {
|
||||
Object retrieved = response.getMetadata().get(QuestionAnswerAdvisor.RETRIEVED_DOCUMENTS);
|
||||
if (retrieved instanceof List<?> docs) {
|
||||
for (Object docObj : docs) {
|
||||
if (docObj instanceof Document doc) {
|
||||
Map<String, Object> metadata = doc.getMetadata();
|
||||
String bookTitle = (String) metadata.get("book_title");
|
||||
Object pageObj = metadata.get("page_number");
|
||||
Integer page = pageObj instanceof Number n ? n.intValue() : null;
|
||||
if (bookTitle != null) {
|
||||
sources.add(new TopicSummaryResponse.SourceReference(bookTitle, page));
|
||||
for (SectionEntity s : sections) {
|
||||
Book book = readyBooks.stream()
|
||||
.filter(b -> b.getId().equals(s.getBookId()))
|
||||
.findFirst()
|
||||
.orElse(null);
|
||||
String title = book != null ? book.getTitle() : "Book";
|
||||
String bookId = book != null ? book.getId().toString() : null;
|
||||
sources.add(new TopicSummaryResponse.SourceReference(bookId, title, s.getPageStart()));
|
||||
}
|
||||
|
||||
for (FigureEntity f : figures) {
|
||||
Book book = readyBooks.stream()
|
||||
.filter(b -> b.getId().equals(f.getBookId()))
|
||||
.findFirst()
|
||||
.orElse(null);
|
||||
String title = book != null ? book.getTitle() : "Book";
|
||||
String bookId = book != null ? book.getId().toString() : null;
|
||||
sources.add(new TopicSummaryResponse.SourceReference(bookId, title, f.getPage()));
|
||||
}
|
||||
|
||||
return sources.stream().distinct().toList();
|
||||
}
|
||||
|
||||
private String serializeSources(List<TopicSummaryResponse.SourceReference> sources) {
|
||||
try {
|
||||
return objectMapper.writeValueAsString(sources);
|
||||
} catch (JsonProcessingException e) {
|
||||
log.warn("Failed to serialize sources, storing empty array", e);
|
||||
return "[]";
|
||||
}
|
||||
}
|
||||
|
||||
// Deduplicate by bookTitle + page
|
||||
return sources.stream().distinct().toList();
|
||||
private List<TopicSummaryResponse.SourceReference> deserializeSources(String json) {
|
||||
try {
|
||||
return objectMapper.readValue(json,
|
||||
objectMapper.getTypeFactory().constructCollectionType(
|
||||
List.class, TopicSummaryResponse.SourceReference.class));
|
||||
} catch (JsonProcessingException e) {
|
||||
log.warn("Failed to deserialize sources from stored JSON", e);
|
||||
return List.of();
|
||||
}
|
||||
}
|
||||
}
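`buildSources` ends with `sources.stream().distinct().toList()`, which works because `SourceReference` is a record and records compare by component values. A short sketch of that behaviour, with a local copy of the record and made-up sample values:

```java
import java.util.List;

// Hypothetical standalone class; SourceReference is redeclared here for illustration.
public class SourceDedupSketch {

    /** Same components as TopicSummaryResponse.SourceReference in the diff. */
    public record SourceReference(String bookId, String bookTitle, Integer page) {}

    /** distinct() uses the record's generated equals/hashCode over all three components. */
    public static List<SourceReference> dedupe(List<SourceReference> sources) {
        return sources.stream().distinct().toList();
    }

    public static void main(String[] args) {
        List<SourceReference> raw = List.of(
                new SourceReference("b1", "Sample Atlas", 12),
                new SourceReference("b1", "Sample Atlas", 12),  // same book and page: dropped
                new SourceReference("b1", "Sample Atlas", 40)); // different page: kept
        System.out.println(dedupe(raw));
    }
}
```

One consequence is that a section and a figure from the same book and page collapse into a single source entry, since nothing in the record distinguishes them.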
|
||||
|
||||
@@ -27,10 +27,10 @@ spring:
|
||||
         index-type: HNSW
         initialize-schema: false
     openai:
-      api-key: ${OPENAI_API_KEY}
+      api-key: ${OPENAI_API_KEY:}
       chat:
         options:
-          model: gpt-4o
+          model: gpt-4o-mini
       embedding:
         options:
           model: "text-embedding-3-small"
|
||||
@@ -47,6 +47,29 @@ spring:
|
||||
max-size: 8
|
||||
queue-capacity: 50
|
||||
|
||||
logging:
|
||||
level:
|
||||
"[org.apache.pdfbox]": ERROR
|
||||
|
||||
app:
|
||||
features:
|
||||
upload-enabled: ${UPLOAD_ENABLED:true}
|
||||
delete-enabled: ${DELETE_ENABLED:true}
|
||||
auth:
|
||||
username: ${APP_AUTH_USERNAME:neurosurgeon}
|
||||
password: ${APP_PASSWORD:changeme}
|
||||
figure-storage:
|
||||
endpoint: ${S3_ENDPOINT:https://s3.immich-ad.ovh}
|
||||
region: ${S3_REGION:garage}
|
||||
bucket: ${S3_BUCKET:aiteacher}
|
||||
access-key-id: ${S3_ACCESS_KEY_ID:}
|
||||
secret-access-key: ${S3_SECRET_ACCESS_KEY:}
|
||||
min-image-size-px: 100
|
||||
embedding:
|
||||
batch-size: 20
|
||||
batch-delay-ms: 2000
|
||||
skip-embedding: false
|
||||
marker:
|
||||
base-url: ${MARKER_BASE_URL:http://192.168.1.105:8000}
|
||||
vision:
|
||||
min-interval-ms: ${VISION_MIN_INTERVAL_MS:2000}
|
||||
|
||||
@@ -0,0 +1,28 @@
|
||||
-- ============================================================
|
||||
-- V4: Document hierarchy — chapter and section tables
|
||||
-- Supports parent-child retrieval pattern for RAG precision.
|
||||
-- ============================================================
|
||||
|
||||
CREATE TABLE IF NOT EXISTS chapter (
|
||||
id VARCHAR(200) PRIMARY KEY,
|
||||
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||
number INT NOT NULL DEFAULT 1,
|
||||
title VARCHAR(500),
|
||||
page_start INT,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||
);
|
||||
|
||||
CREATE TABLE IF NOT EXISTS section (
|
||||
id VARCHAR(200) PRIMARY KEY,
|
||||
chapter_id VARCHAR(200) NOT NULL REFERENCES chapter(id) ON DELETE CASCADE,
|
||||
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||
number VARCHAR(50),
|
||||
title VARCHAR(500),
|
||||
page_start INT NOT NULL,
|
||||
page_end INT NOT NULL,
|
||||
full_text TEXT NOT NULL,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||
);
|
||||
|
||||
CREATE INDEX IF NOT EXISTS idx_section_book ON section(book_id);
|
||||
CREATE INDEX IF NOT EXISTS idx_section_chapter ON section(chapter_id);
|
||||
@@ -0,0 +1,29 @@
|
||||
-- ============================================================
|
||||
-- V5: Figures and chunk-to-figure reference table
|
||||
-- figure: metadata + file path for each extracted image
|
||||
-- chunk_figure_ref: links vector-store chunks to figures
|
||||
-- ============================================================
|
||||
|
||||
CREATE TABLE IF NOT EXISTS figure (
|
||||
id VARCHAR(200) PRIMARY KEY,
|
||||
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||
section_id VARCHAR(200) REFERENCES section(id) ON DELETE SET NULL,
|
||||
chapter_id VARCHAR(200) REFERENCES chapter(id) ON DELETE SET NULL,
|
||||
label VARCHAR(100),
|
||||
caption TEXT,
|
||||
figure_type VARCHAR(50) NOT NULL,
|
||||
page INT NOT NULL,
|
||||
image_path VARCHAR(1000) NOT NULL,
|
||||
caption_embedding_id UUID,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||
);
|
||||
|
||||
CREATE TABLE IF NOT EXISTS chunk_figure_ref (
|
||||
chunk_id UUID NOT NULL,
|
||||
figure_id VARCHAR(200) NOT NULL REFERENCES figure(id) ON DELETE CASCADE,
|
||||
mention_page INT,
|
||||
PRIMARY KEY (chunk_id, figure_id)
|
||||
);
|
||||
|
||||
CREATE INDEX IF NOT EXISTS idx_figure_book ON figure(book_id);
|
||||
CREATE INDEX IF NOT EXISTS idx_cfr_chunk ON chunk_figure_ref(chunk_id);
|
||||
@@ -0,0 +1,10 @@
|
||||
CREATE TABLE topic_summary (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    topic_id       VARCHAR(100) NOT NULL,
    summary_number INT NOT NULL,
    summary        TEXT NOT NULL,
    sources_json   TEXT NOT NULL,
    generated_at   TIMESTAMPTZ NOT NULL
);

CREATE INDEX idx_topic_summary_topic_id ON topic_summary(topic_id, summary_number);
|
||||
@@ -0,0 +1,37 @@
|
||||
version: '3.9'
|
||||
|
||||
services:
|
||||
postgres:
|
||||
image: pgvector/pgvector:pg16
|
||||
container_name: aiteacher-postgres-native
|
||||
environment:
|
||||
POSTGRES_DB: aiteacher
|
||||
POSTGRES_USER: aiteacher
|
||||
POSTGRES_PASSWORD: aiteacher
|
||||
ports:
|
||||
- "5432:5432"
|
||||
volumes:
|
||||
- pgdata_native:/var/lib/postgresql/data
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "pg_isready -U aiteacher -d aiteacher"]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
|
||||
backend:
|
||||
image: ai-teacher-backend:latest
|
||||
container_name: aiteacher-backend-native
|
||||
env_file:
|
||||
- .env
|
||||
environment:
|
||||
SPRING_DATASOURCE_URL: jdbc:postgresql://postgres:5432/aiteacher
|
||||
SPRING_DATASOURCE_USERNAME: aiteacher
|
||||
SPRING_DATASOURCE_PASSWORD: aiteacher
|
||||
ports:
|
||||
- "8080:8080"
|
||||
depends_on:
|
||||
postgres:
|
||||
condition: service_healthy
|
||||
|
||||
volumes:
|
||||
pgdata_native:
|
||||
@@ -3,5 +3,12 @@
|
||||
# In production point it directly at the backend, e.g. https://api.example.com/api/v1
|
||||
VITE_API_URL=/api/v1
|
||||
|
||||
# Shared password for HTTP Basic auth (must match APP_PASSWORD on the backend).
|
||||
VITE_APP_PASSWORD=changeme
|
||||
# Credentials are no longer configured here. Users enter their username and
|
||||
# password via the login form. The backend validates them via HTTP Basic Auth.
|
||||
# Configure the backend credentials with APP_AUTH_USERNAME and APP_PASSWORD.
|
||||
|
||||
# Set to 'false' to hide the upload UI (frontend). Also set UPLOAD_ENABLED=false on the backend to block the endpoint.
|
||||
VITE_UPLOAD_ENABLED=true
|
||||
|
||||
# Set to 'false' to hide the delete button (frontend). Also set DELETE_ENABLED=false on the backend to block the endpoint.
|
||||
VITE_DELETE_ENABLED=true
|
||||
|
||||
+5
-3
@@ -1,5 +1,5 @@
|
||||
 # ---- Build stage ----
-FROM node:20-alpine AS build
+FROM docker.io/library/node:20-alpine AS build
 WORKDIR /app
 COPY package*.json ./
 RUN npm ci
|
||||
@@ -7,8 +7,10 @@ COPY . .
|
||||
 RUN npm run build
 
 # ---- Runtime stage (nginx) ----
-FROM nginx:alpine
+FROM docker.io/library/nginx:alpine
 COPY --from=build /app/dist /usr/share/nginx/html
 COPY nginx.conf /etc/nginx/conf.d/default.conf
+COPY docker-entrypoint.sh /docker-entrypoint.sh
+RUN chmod +x /docker-entrypoint.sh
 EXPOSE 80
-CMD ["nginx", "-g", "daemon off;"]
+ENTRYPOINT ["/docker-entrypoint.sh"]
|
||||
|
||||
@@ -0,0 +1,16 @@
|
||||
#!/bin/sh
set -e

# Write runtime env vars into a JS file loaded before the app bundle.
# Any VITE_* variable passed via `docker run -e` will be available as
# window.__env__.VITE_* inside the browser.
cat > /usr/share/nginx/html/env-config.js <<EOF
window.__env__ = {
  VITE_API_URL: "${VITE_API_URL:-}",
  VITE_APP_PASSWORD: "${VITE_APP_PASSWORD:-}",
  VITE_UPLOAD_ENABLED: "${VITE_UPLOAD_ENABLED:-}",
  VITE_DELETE_ENABLED: "${VITE_DELETE_ENABLED:-}"
};
EOF

exec nginx -g "daemon off;"
|
||||
@@ -8,6 +8,7 @@
|
||||
</head>
|
||||
<body>
|
||||
<div id="app"></div>
|
||||
<script src="/env-config.js"></script>
|
||||
<script type="module" src="/src/main.ts"></script>
|
||||
</body>
|
||||
</html>
|
||||
|
||||
+139
-4
@@ -6,6 +6,11 @@
|
||||
<span class="brand-name">AI Teacher</span>
|
||||
<span class="brand-subtitle">Neurosurgeon Learning Platform</span>
|
||||
</div>
|
||||
<template v-if="authStore.isAuthenticated">
|
||||
<button class="burger" :class="{ open: menuOpen }" @click="menuOpen = !menuOpen" aria-label="Menu">
|
||||
<span></span><span></span><span></span>
|
||||
</button>
|
||||
<div class="nav-drawer" :class="{ open: menuOpen }" @click="menuOpen = false">
|
||||
<ul class="navbar-links">
|
||||
<li>
|
||||
<RouterLink to="/" :class="{ active: $route.path === '/' }">
|
||||
@@ -23,6 +28,9 @@
|
||||
</RouterLink>
|
||||
</li>
|
||||
</ul>
|
||||
<button class="btn btn-logout" @click.stop="logout">Sign out</button>
|
||||
</div>
|
||||
</template>
|
||||
</nav>
|
||||
|
||||
<main class="main-content">
|
||||
@@ -35,12 +43,26 @@
|
||||
</template>
|
||||
|
||||
<script setup lang="ts">
|
||||
import { ref, provide } from 'vue'
|
||||
import { RouterLink, RouterView } from 'vue-router'
|
||||
import { ref, provide, watch } from 'vue'
|
||||
import { RouterLink, RouterView, useRouter, useRoute } from 'vue-router'
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
|
||||
const authStore = useAuthStore()
|
||||
const router = useRouter()
|
||||
const route = useRoute()
|
||||
|
||||
const menuOpen = ref(false)
|
||||
const toastMessage = ref('')
|
||||
const toastType = ref<'toast-error' | 'toast-success'>('toast-error')
|
||||
|
||||
// Close menu on navigation
|
||||
watch(() => route.path, () => { menuOpen.value = false })
|
||||
|
||||
function logout() {
|
||||
authStore.clearCredentials()
|
||||
router.push({ name: 'login' })
|
||||
}
|
||||
|
||||
function showToast(message: string, type: 'error' | 'success' = 'error') {
|
||||
toastMessage.value = message
|
||||
toastType.value = type === 'error' ? 'toast-error' : 'toast-success'
|
||||
@@ -64,11 +86,11 @@ body {
|
||||
Ubuntu, Cantarell, 'Fira Sans', 'Droid Sans', 'Helvetica Neue', sans-serif;
|
||||
background: #f0f4f8;
|
||||
color: #2d3748;
|
||||
min-height: 100vh;
|
||||
height: 100vh;
|
||||
}
|
||||
|
||||
#app {
|
||||
min-height: 100vh;
|
||||
height: 100vh;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
}
|
||||
@@ -82,6 +104,9 @@ body {
|
||||
justify-content: space-between;
|
||||
height: 64px;
|
||||
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.3);
|
||||
position: sticky;
|
||||
top: 0;
|
||||
z-index: 100;
|
||||
}
|
||||
|
||||
.navbar-brand {
|
||||
@@ -106,6 +131,13 @@ body {
|
||||
margin-left: 0.25rem;
|
||||
}
|
||||
|
||||
/* Desktop: links inline */
|
||||
.nav-drawer {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.navbar-links {
|
||||
list-style: none;
|
||||
display: flex;
|
||||
@@ -131,8 +163,38 @@ body {
|
||||
color: white;
|
||||
}
|
||||
|
||||
/* Burger button — hidden on desktop */
|
||||
.burger {
|
||||
display: none;
|
||||
flex-direction: column;
|
||||
justify-content: center;
|
||||
gap: 5px;
|
||||
width: 36px;
|
||||
height: 36px;
|
||||
background: transparent;
|
||||
border: none;
|
||||
cursor: pointer;
|
||||
padding: 4px;
|
||||
border-radius: 6px;
|
||||
}
|
||||
|
||||
.burger span {
|
||||
display: block;
|
||||
height: 2px;
|
||||
background: #bee3f8;
|
||||
border-radius: 2px;
|
||||
transition: transform 0.2s, opacity 0.2s;
|
||||
}
|
||||
|
||||
.burger.open span:nth-child(1) { transform: translateY(7px) rotate(45deg); }
|
||||
.burger.open span:nth-child(2) { opacity: 0; }
|
||||
.burger.open span:nth-child(3) { transform: translateY(-7px) rotate(-45deg); }
|
||||
|
||||
.main-content {
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
padding: 2rem;
|
||||
max-width: 1200px;
|
||||
margin: 0 auto;
|
||||
@@ -224,6 +286,20 @@ body {
|
||||
background: #cbd5e0;
|
||||
}
|
||||
|
||||
.btn-logout {
|
||||
background: transparent;
|
||||
color: #bee3f8;
|
||||
border: 1px solid #4a90b8;
|
||||
font-size: 0.85rem;
|
||||
padding: 0.4rem 0.9rem;
|
||||
margin-left: 1rem;
|
||||
}
|
||||
|
||||
.btn-logout:hover {
|
||||
background: #2b6cb0;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.spinner {
|
||||
display: inline-block;
|
||||
width: 20px;
|
||||
@@ -284,4 +360,63 @@ body {
|
||||
font-size: 0.9rem;
|
||||
margin-top: 0.5rem;
|
||||
}
|
||||
|
||||
@media (max-width: 768px) {
|
||||
.navbar {
|
||||
padding: 0 1rem;
|
||||
}
|
||||
|
||||
.brand-subtitle {
|
||||
display: none;
|
||||
}
|
||||
|
||||
/* Show burger, hide desktop drawer */
|
||||
.burger {
|
||||
display: flex;
|
||||
}
|
||||
|
||||
.nav-drawer {
|
||||
display: none;
|
||||
position: absolute;
|
||||
top: 64px;
|
||||
right: 0;
|
||||
left: 0;
|
||||
background: #1a365d;
|
||||
flex-direction: column;
|
||||
align-items: stretch;
|
||||
padding: 0.5rem 0 1rem;
|
||||
box-shadow: 0 4px 12px rgba(0, 0, 0, 0.3);
|
||||
z-index: 99;
|
||||
}
|
||||
|
||||
.nav-drawer.open {
|
||||
display: flex;
|
||||
}
|
||||
|
||||
.navbar-links {
|
||||
flex-direction: column;
|
||||
gap: 0;
|
||||
}
|
||||
|
||||
.navbar-links a {
|
||||
padding: 0.85rem 1.5rem;
|
||||
border-radius: 0;
|
||||
font-size: 1rem;
|
||||
}
|
||||
|
||||
.navbar-links a:hover,
|
||||
.navbar-links a.active {
|
||||
background: #2b6cb0;
|
||||
}
|
||||
|
||||
.btn-logout {
|
||||
margin: 0.5rem 1.5rem 0;
|
||||
width: calc(100% - 3rem);
|
||||
justify-content: center;
|
||||
}
|
||||
|
||||
.main-content {
|
||||
padding: 1rem;
|
||||
}
|
||||
}
|
||||
</style>
|
||||
|
||||
@@ -33,7 +33,15 @@
|
||||
</div>
|
||||
|
||||
<div class="book-actions">
|
||||
<router-link
|
||||
v-if="book.status === 'READY'"
|
||||
:to="{ name: 'book-reader', params: { id: book.id } }"
|
||||
class="btn btn-secondary"
|
||||
>
|
||||
Read
|
||||
</router-link>
|
||||
<button
|
||||
v-if="deleteEnabled"
|
||||
class="btn btn-danger"
|
||||
:disabled="book.status === 'PROCESSING' || deleting"
|
||||
@click="$emit('delete', book.id)"
|
||||
@@ -52,6 +60,7 @@ import type { Book } from '@/stores/bookStore'
|
||||
const props = defineProps<{
|
||||
book: Book
|
||||
deleting?: boolean
|
||||
deleteEnabled?: boolean
|
||||
}>()
|
||||
|
||||
defineEmits<{
|
||||
@@ -181,6 +190,7 @@ function formatDate(iso: string): string {
|
||||
.book-actions {
|
||||
display: flex;
|
||||
justify-content: flex-end;
|
||||
gap: 0.5rem;
|
||||
margin-top: 0.25rem;
|
||||
}
|
||||
</style>
|
||||
|
||||
@@ -0,0 +1,239 @@
|
||||
<template>
|
||||
<div class="book-panel">
|
||||
<div class="book-panel-header">
|
||||
<span class="book-panel-title">{{ bookTitle || 'Book' }} — p. {{ page }}</span>
|
||||
<div class="book-panel-nav">
|
||||
<button class="nav-btn" :disabled="page <= 1" @click="emit('navigate', page - 1)">←</button>
|
||||
<button class="nav-btn" @click="emit('navigate', page + 1)">→</button>
|
||||
</div>
|
||||
<button class="close-btn" @click="emit('close')" title="Close">✕</button>
|
||||
</div>
|
||||
|
||||
<div class="book-panel-body">
|
||||
<div v-if="loading" class="panel-loading">
|
||||
<div class="spinner spinner-dark" style="width:24px;height:24px;margin:0 auto 0.5rem;"></div>
|
||||
<p>Loading page {{ page }}…</p>
|
||||
</div>
|
||||
<div v-else-if="error" class="panel-error">{{ error }}</div>
|
||||
<div v-else class="markdown-body" v-html="renderedHtml"></div>
|
||||
</div>
|
||||
</div>
|
||||
</template>
|
||||
|
||||
<script setup lang="ts">
|
||||
import { ref, watch, onMounted, onUnmounted } from 'vue'
|
||||
import { api } from '@/services/api'
|
||||
|
||||
const props = defineProps<{
|
||||
bookId: string
|
||||
page: number
|
||||
bookTitle?: string
|
||||
}>()
|
||||
|
||||
const emit = defineEmits<{
|
||||
close: []
|
||||
navigate: [page: number]
|
||||
}>()
|
||||
|
||||
const loading = ref(false)
|
||||
const error = ref<string | null>(null)
|
||||
const renderedHtml = ref('')
|
||||
let activeBlobUrls: string[] = []
|
||||
|
||||
onMounted(() => loadPage(props.page))
|
||||
|
||||
watch(() => [props.bookId, props.page], () => loadPage(props.page))
|
||||
|
||||
onUnmounted(() => {
|
||||
activeBlobUrls.forEach(u => URL.revokeObjectURL(u))
|
||||
})
|
||||
|
||||
async function loadPage(page: number) {
|
||||
loading.value = true
|
||||
error.value = null
|
||||
renderedHtml.value = ''
|
||||
activeBlobUrls.forEach(u => URL.revokeObjectURL(u))
|
||||
activeBlobUrls = []
|
||||
|
||||
try {
|
||||
const res = await api.get<string>(`/books/${props.bookId}/pages/${page}/html`, {
|
||||
headers: { Accept: 'text/html' },
|
||||
responseType: 'text'
|
||||
})
|
||||
renderedHtml.value = await resolveImages(res.data)
|
||||
} catch (e: any) {
|
||||
error.value = e.message ?? 'Failed to load page.'
|
||||
} finally {
|
||||
loading.value = false
|
||||
}
|
||||
}
|
||||
|
||||
async function resolveImages(html: string): Promise<string> {
  const srcPattern = /src="(\/api\/v1\/figures\/[^"]+)"/g
  const matches = [...html.matchAll(srcPattern)]
  if (matches.length === 0) return html

  const unique = [...new Set(matches.map(m => m[1]))]
  const blobMap: Record<string, string> = {}

  await Promise.all(
    unique.map(async (src) => {
      try {
        const res = await api.get(src.replace(/^\/api\/v1/, ''), { responseType: 'blob' })
        const blobUrl = URL.createObjectURL(res.data)
        activeBlobUrls.push(blobUrl)
        blobMap[src] = blobUrl
      } catch {
        // leave original src
      }
    })
  )

  return html.replace(/src="(\/api\/v1\/figures\/[^"]+)"/g, (_, src) =>
    blobMap[src] ? `src="${blobMap[src]}"` : `src="${src}"`
  )
}
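Review note: `resolveImages` makes two passes over the HTML, first collecting each distinct figure path and then rewriting the `src` attributes. The sketch below mirrors those passes without the network calls, so the string handling can be checked on its own. `uniqueFigureSrcs`, `rewriteFigureSrcs`, and the `blob:fake-a` value are hypothetical names for illustration, not part of this change.

```typescript
// Hypothetical helpers mirroring the two passes of resolveImages(), minus the API calls.
const FIGURE_SRC = /src="(\/api\/v1\/figures\/[^"]+)"/g

// Pass 1: collect each distinct figure path once, in first-seen order.
function uniqueFigureSrcs(html: string): string[] {
  const seen: string[] = []
  const pattern = new RegExp(FIGURE_SRC.source, 'g') // fresh regex, so lastIndex starts at 0
  let m: RegExpExecArray | null
  while ((m = pattern.exec(html)) !== null) {
    if (seen.indexOf(m[1]) === -1) seen.push(m[1])
  }
  return seen
}

// Pass 2: swap in replacement URLs where one exists; unresolved paths stay as they were.
function rewriteFigureSrcs(html: string, resolved: Record<string, string>): string {
  return html.replace(new RegExp(FIGURE_SRC.source, 'g'), (whole: string, src: string) =>
    resolved[src] ? `src="${resolved[src]}"` : whole
  )
}

const samplePage =
  '<img src="/api/v1/figures/a.png"><img src="/api/v1/figures/b.png"><img src="/api/v1/figures/a.png">'
const figurePaths = uniqueFigureSrcs(samplePage)
const rewrittenPage = rewriteFigureSrcs(samplePage, { '/api/v1/figures/a.png': 'blob:fake-a' })

console.log(figurePaths)
console.log(rewrittenPage)
```

In this sketch a figure that fails to resolve keeps its original `/api/v1/figures/...` path, which matches the "leave original src" branch in the component.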
</script>
|
||||
|
||||
<style scoped>
|
||||
.book-panel {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
height: 100%;
|
||||
background: white;
|
||||
border-left: 1px solid #e2e8f0;
|
||||
border-radius: 0 10px 10px 0;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.book-panel-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
padding: 0.6rem 0.75rem;
|
||||
background: #f7fafc;
|
||||
border-bottom: 1px solid #e2e8f0;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
.book-panel-title {
|
||||
flex: 1;
|
||||
font-size: 0.8rem;
|
||||
font-weight: 600;
|
||||
color: #2b6cb0;
|
||||
white-space: nowrap;
|
||||
overflow: hidden;
|
||||
text-overflow: ellipsis;
|
||||
}
|
||||
|
||||
.book-panel-nav {
|
||||
display: flex;
|
||||
gap: 0.25rem;
|
||||
}
|
||||
|
||||
.nav-btn {
|
||||
width: 1.75rem;
|
||||
height: 1.75rem;
|
||||
border: 1px solid #cbd5e0;
|
||||
border-radius: 5px;
|
||||
background: white;
|
||||
cursor: pointer;
|
||||
font-size: 0.85rem;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
transition: background 0.15s;
|
||||
}
|
||||
.nav-btn:hover:not(:disabled) { background: #ebf8ff; border-color: #3182ce; }
|
||||
.nav-btn:disabled { opacity: 0.4; cursor: not-allowed; }
|
||||
|
||||
.close-btn {
|
||||
width: 1.75rem;
|
||||
height: 1.75rem;
|
||||
border: none;
|
||||
border-radius: 5px;
|
||||
background: none;
|
||||
cursor: pointer;
|
||||
font-size: 1rem;
|
||||
color: #718096;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
transition: background 0.15s, color 0.15s;
|
||||
}
|
||||
.close-btn:hover { background: #fed7d7; color: #742a2a; }
|
||||
|
||||
.book-panel-body {
|
||||
flex: 1;
|
||||
overflow-y: auto;
|
||||
padding: 1rem 1.25rem;
|
||||
}
|
||||
|
||||
.panel-loading {
|
||||
text-align: center;
|
||||
padding: 2rem;
|
||||
color: #718096;
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
.panel-error {
|
||||
padding: 1rem;
|
||||
background: #fff5f5;
|
||||
border: 1px solid #fed7d7;
|
||||
color: #742a2a;
|
||||
border-radius: 6px;
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
.markdown-body {
|
||||
font-size: 0.9rem;
|
||||
line-height: 1.75;
|
||||
color: #2d3748;
|
||||
}
|
||||
|
||||
.markdown-body :deep(h1),
|
||||
.markdown-body :deep(h2),
|
||||
.markdown-body :deep(h3) {
|
||||
color: #1a365d;
|
||||
font-weight: 600;
|
||||
margin: 1.25rem 0 0.5rem;
|
||||
}
|
||||
.markdown-body :deep(h2) { font-size: 1.05rem; border-bottom: 1px solid #e2e8f0; padding-bottom: 0.3rem; }
|
||||
.markdown-body :deep(h3) { font-size: 0.95rem; }
|
||||
.markdown-body :deep(p) { margin: 0.6rem 0; }
|
||||
.markdown-body :deep(img) {
|
||||
max-width: 100%;
|
||||
border-radius: 6px;
|
||||
display: block;
|
||||
margin: 0.75rem auto;
|
||||
box-shadow: 0 1px 4px rgba(0,0,0,0.12);
|
||||
}
|
||||
.markdown-body :deep(ul),
|
||||
.markdown-body :deep(ol) { padding-left: 1.4rem; margin: 0.5rem 0; }
|
||||
.markdown-body :deep(code) {
|
||||
background: #f7fafc;
|
||||
border: 1px solid #e2e8f0;
|
||||
border-radius: 3px;
|
||||
padding: 0.1em 0.3em;
|
||||
font-size: 0.85em;
|
||||
}
|
||||
.markdown-body :deep(blockquote) {
|
||||
border-left: 3px solid #3182ce;
|
||||
padding-left: 0.75rem;
|
||||
color: #4a5568;
|
||||
margin: 0.5rem 0;
|
||||
}
|
||||
.markdown-body :deep(table) {
|
||||
width: 100%;
|
||||
border-collapse: collapse;
|
||||
font-size: 0.875em;
|
||||
margin: 0.75rem 0;
|
||||
}
|
||||
.markdown-body :deep(th),
|
||||
.markdown-body :deep(td) {
|
||||
border: 1px solid #e2e8f0;
|
||||
padding: 0.35rem 0.6rem;
|
||||
text-align: left;
|
||||
}
|
||||
.markdown-body :deep(th) { background: #f7fafc; font-weight: 600; }
|
||||
</style>
|
||||
@@ -3,24 +3,65 @@
|
||||
<div class="message-bubble" :class="isUser ? 'bubble-user' : 'bubble-assistant'">
|
||||
<div class="message-role">{{ isUser ? 'You' : 'AI Teacher' }}</div>
|
||||
<div v-if="isUser" class="message-content">{{ message.content }}</div>
|
||||
<div v-else class="message-content message-content--markdown" v-html="renderedContent"></div>
|
||||
<div v-else class="message-content message-content--markdown" v-html="renderedWithBadges" @click="onContentClick"></div>
|
||||
|
||||
<!-- Source chips for assistant messages -->
|
||||
<!-- Sources for assistant messages -->
|
||||
<div v-if="!isUser && message.sources && message.sources.length > 0" class="message-sources">
|
||||
<div class="sources-label">Sources:</div>
|
||||
<div class="source-list">
|
||||
<div class="source-list" ref="sourceListEl">
|
||||
<!-- TEXT sources -->
|
||||
<div
|
||||
v-for="(source, idx) in message.sources"
|
||||
:key="idx"
|
||||
v-for="(source, idx) in textSources"
|
||||
:key="'text-' + idx"
|
||||
class="source-item"
|
||||
:class="{ 'source-item--active': activeRef === source.refLabel }"
|
||||
:data-ref-label="source.refLabel"
|
||||
>
|
||||
<div class="source-chip">
|
||||
<span class="source-book-icon">📖</span>
|
||||
<div
|
||||
class="source-chip source-chip--text"
|
||||
:class="{ 'source-chip--clickable': source.bookId && source.page }"
|
||||
@click="source.bookId && source.page ? emit('open-source', source.bookId, source.page) : undefined"
|
||||
>
|
||||
<span class="source-icon">📖</span>
|
||||
<span v-if="source.refLabel" class="source-ref-label">{{ source.refLabel }}</span>
|
||||
<span class="source-book-title">{{ source.bookTitle }}</span>
|
||||
<span v-if="source.page" class="source-page">p. {{ source.page }}</span>
|
||||
<span v-if="source.bookId && source.page" class="source-open-hint">↗</span>
|
||||
</div>
|
||||
<div v-if="source.chunkText" class="source-chunk">{{ source.chunkText }}</div>
|
||||
</div>
|
||||
|
||||
<!-- FIGURE sources -->
|
||||
<div
|
||||
v-for="(source, idx) in figureSources"
|
||||
:key="'fig-' + idx"
|
||||
class="source-item source-item--figure"
|
||||
:class="{ 'source-item--active': activeRef === source.refLabel }"
|
||||
:data-ref-label="source.refLabel"
|
||||
>
|
||||
<div
|
||||
class="source-chip source-chip--figure"
|
||||
:class="{ 'source-chip--clickable': source.bookId && source.page }"
|
||||
@click="source.bookId && source.page ? emit('open-source', source.bookId, source.page) : undefined"
|
||||
>
|
||||
<span class="source-icon">🖼️</span>
|
||||
<span v-if="source.refLabel" class="source-ref-label source-ref-label--figure">{{ source.refLabel }}</span>
|
||||
<span class="source-figure-label">{{ source.label || 'Figure' }}</span>
|
||||
<span v-if="source.page" class="source-page">p. {{ source.page }}</span>
|
||||
<span v-if="source.figureType" class="source-figure-type">{{ formatFigureType(source.figureType) }}</span>
|
||||
<span v-if="source.bookId && source.page" class="source-open-hint">↗</span>
|
||||
</div>
|
||||
<div v-if="source.caption" class="source-caption">{{ source.caption }}</div>
|
||||
<div class="source-figure-image">
|
||||
<img
|
||||
:src="source.imageUrl"
|
||||
:alt="source.caption || source.label || 'Figure'"
|
||||
class="figure-img"
|
||||
loading="lazy"
|
||||
@error="onImageError"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
@@ -30,16 +71,81 @@
|
||||
</template>
|
||||
|
||||
<script setup lang="ts">
|
||||
import { computed } from 'vue'
|
||||
import { computed, ref } from 'vue'
|
||||
import { marked } from 'marked'
|
||||
import type { ChatMessage } from '@/stores/chatStore'
|
||||
import type { ChatMessage, ChatSource } from '@/stores/chatStore'
|
||||
|
||||
const props = defineProps<{
|
||||
message: ChatMessage
|
||||
}>()
|
||||
|
||||
const emit = defineEmits<{
|
||||
'open-source': [bookId: string, page: number]
|
||||
}>()
|
||||
|
||||
const isUser = computed(() => props.message.role === 'USER')
|
||||
const renderedContent = computed(() => marked.parse(props.message.content) as string)
|
||||
const activeRef = ref<string | null>(null)
|
||||
const sourceListEl = ref<HTMLElement | null>(null)
|
||||
|
||||
/** Replaces [S1]/[F1]-style labels in the rendered HTML with clickable badges. */
const renderedWithBadges = computed(() => {
  const html = marked.parse(props.message.content) as string
  return html.replace(/\[(S|F)\d+\]/g, (match) => {
    const inner = match.slice(1, -1) // e.g. "S1"
    return `<span class="citation-badge" data-ref="${inner}" title="Jump to source ${inner}">${match}</span>`
  })
})
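Review note: the badge substitution is a plain string rewrite, so it can be tried outside the component. A minimal sketch, assuming the same `[S1]`/`[F1]` label format; `addCitationBadges` is a hypothetical name and the `marked` rendering step is left out.

```typescript
// Hypothetical standalone version of the badge substitution; takes already-rendered HTML.
function addCitationBadges(html: string): string {
  return html.replace(/\[(S|F)\d+\]/g, (match) => {
    const inner = match.slice(1, -1) // strip the brackets, e.g. "[S1]" -> "S1"
    return `<span class="citation-badge" data-ref="${inner}" title="Jump to source ${inner}">${match}</span>`
  })
}

const badgeHtml = addCitationBadges('<p>See [S1] and [F12], but not [X3].</p>')
console.log(badgeHtml)
```

Only labels starting with `S` or `F` followed by digits are wrapped; anything else, such as `[X3]`, passes through unchanged. Because the replacement runs over rendered HTML, a label inside a code span would presumably be wrapped as well.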
|
||||
function onContentClick(e: MouseEvent) {
|
||||
const target = e.target as HTMLElement
|
||||
if (!target.classList.contains('citation-badge')) return
|
||||
|
||||
const label = target.getAttribute('data-ref') // e.g. "S1" or "F1"
|
||||
if (!label) return
|
||||
|
||||
activeRef.value = activeRef.value === label ? null : label
|
||||
|
||||
// Scroll to the matching source chip
|
||||
const sourceEl = sourceListEl.value?.querySelector(`[data-ref-label="${label}"]`) as HTMLElement | null
|
||||
sourceEl?.scrollIntoView({ behavior: 'smooth', block: 'start' })
|
||||
|
||||
// Open the book at the referenced page
|
||||
const allSources = props.message.sources ?? []
|
||||
const source = allSources.find((s: ChatSource) => s.refLabel === label)
|
||||
if (source?.bookId && source.page) {
|
||||
emit('open-source', source.bookId, source.page)
|
||||
}
|
||||
}
|
||||
|
||||
const textSources = computed(() =>
|
||||
(props.message.sources ?? []).filter((s: ChatSource) => s.type === 'TEXT' || !s.type)
|
||||
)
|
||||
|
||||
const figureSources = computed(() =>
|
||||
(props.message.sources ?? []).filter((s: ChatSource) => s.type === 'FIGURE')
|
||||
)
|
||||
|
||||
function formatFigureType(type: string): string {
|
||||
const labels: Record<string, string> = {
|
||||
ANATOMICAL_DIAGRAM: 'Anatomical Diagram',
|
||||
SURGICAL_PHOTOGRAPH: 'Surgical Photo',
|
||||
MRI_CT_SCAN: 'MRI / CT',
|
||||
TABLE: 'Table',
|
||||
CHART: 'Chart',
|
||||
INTRAOPERATIVE_IMAGE: 'Intraoperative'
|
||||
}
|
||||
return labels[type] ?? type
|
||||
}
|
||||
|
||||
function onImageError(e: Event) {
|
||||
const img = e.target as HTMLImageElement
|
||||
img.alt = 'Image unavailable'
|
||||
img.style.display = 'none'
|
||||
const wrapper = img.parentElement
|
||||
if (wrapper) {
|
||||
wrapper.innerHTML = '<span class="figure-missing">Image unavailable</span>'
|
||||
}
|
||||
}
|
||||
|
||||
function formatTime(iso: string): string {
|
||||
return new Date(iso).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })
|
||||
@@ -182,6 +288,71 @@ function formatTime(iso: string): string {
|
||||
gap: 0.25rem;
|
||||
}
|
||||
|
||||
.source-item--figure {
|
||||
gap: 0.4rem;
|
||||
}
|
||||
|
||||
.source-chip {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.25rem;
|
||||
border-radius: 4px;
|
||||
padding: 0.2rem 0.5rem;
|
||||
font-size: 0.78rem;
|
||||
}
|
||||
|
||||
.source-chip--text {
|
||||
background: #ebf8ff;
|
||||
border: 1px solid #bee3f8;
|
||||
}
|
||||
|
||||
.source-chip--clickable {
|
||||
cursor: pointer;
|
||||
transition: background 0.15s, border-color 0.15s;
|
||||
}
|
||||
|
||||
.source-chip--clickable:hover {
|
||||
background: #bee3f8;
|
||||
border-color: #90cdf4;
|
||||
}
|
||||
|
||||
.source-open-hint {
|
||||
font-size: 0.75rem;
|
||||
color: #3182ce;
|
||||
margin-left: 0.1rem;
|
||||
}
|
||||
|
||||
.source-chip--figure {
|
||||
background: #f0fff4;
|
||||
border: 1px solid #9ae6b4;
|
||||
}
|
||||
|
||||
.source-icon {
|
||||
font-size: 0.8rem;
|
||||
}
|
||||
|
||||
.source-book-title {
|
||||
color: #2b6cb0;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.source-figure-label {
|
||||
color: #276749;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.source-figure-type {
|
||||
color: #718096;
|
||||
font-size: 0.72rem;
|
||||
background: #e2e8f0;
|
||||
border-radius: 3px;
|
||||
padding: 0 0.3rem;
|
||||
}
|
||||
|
||||
.source-page {
|
||||
color: #718096;
|
||||
}
|
||||
|
||||
.source-chunk {
|
||||
font-size: 0.78rem;
|
||||
color: #4a5568;
|
||||
@@ -194,28 +365,65 @@ function formatTime(iso: string): string {
|
||||
line-height: 1.5;
|
||||
}
|
||||
|
||||
.source-chip {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.25rem;
|
||||
background: #ebf8ff;
|
||||
border: 1px solid #bee3f8;
|
||||
border-radius: 4px;
|
||||
padding: 0.2rem 0.5rem;
|
||||
.source-caption {
|
||||
font-size: 0.78rem;
|
||||
color: #4a5568;
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.source-book-icon {
|
||||
font-size: 0.8rem;
|
||||
.source-figure-image {
|
||||
max-width: 100%;
|
||||
}
|
||||
|
||||
.source-book-title {
|
||||
.figure-img {
|
||||
max-width: 100%;
|
||||
max-height: 300px;
|
||||
border-radius: 6px;
|
||||
border: 1px solid #e2e8f0;
|
||||
object-fit: contain;
|
||||
}
|
||||
|
||||
.figure-missing {
|
||||
font-size: 0.78rem;
|
||||
color: #a0aec0;
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.message-content--markdown :deep(.citation-badge) {
|
||||
display: inline-block;
|
||||
background: #ebf8ff;
|
||||
border: 1px solid #90cdf4;
|
||||
border-radius: 3px;
|
||||
padding: 0 0.3em;
|
||||
font-size: 0.78em;
|
||||
font-weight: 600;
|
||||
color: #2b6cb0;
|
||||
font-weight: 500;
|
||||
cursor: pointer;
|
||||
user-select: none;
|
||||
transition: background 0.15s;
|
||||
}
|
||||
|
||||
.source-page {
|
||||
color: #718096;
|
||||
.message-content--markdown :deep(.citation-badge:hover) {
|
||||
background: #bee3f8;
|
||||
}
|
||||
|
||||
.source-item--active {
|
||||
outline: 2px solid #4299e1;
|
||||
border-radius: 6px;
|
||||
}
|
||||
|
||||
.source-ref-label {
|
||||
font-size: 0.72rem;
|
||||
font-weight: 700;
|
||||
background: #bee3f8;
|
||||
color: #2b6cb0;
|
||||
border-radius: 3px;
|
||||
padding: 0 0.3rem;
|
||||
}
|
||||
|
||||
.source-ref-label--figure {
|
||||
background: #9ae6b4;
|
||||
color: #276749;
|
||||
}
|
||||
|
||||
.message-timestamp {
|
||||
|
||||
Vendored
+2
@@ -3,6 +3,8 @@
|
||||
interface ImportMetaEnv {
|
||||
readonly VITE_API_URL: string
|
||||
readonly VITE_APP_PASSWORD: string
|
||||
readonly VITE_UPLOAD_ENABLED: string
|
||||
readonly VITE_DELETE_ENABLED: string
|
||||
}
|
||||
|
||||
interface ImportMeta {
|
||||
|
||||
@@ -0,0 +1,10 @@
|
||||
/**
 * Read a VITE_ env variable.
 * At runtime in Docker, values come from window.__env__ (injected by docker-entrypoint.sh).
 * At build time (dev / CI), values come from import.meta.env.
 */
export function env(key: string): string | undefined {
  const runtime = (window as Record<string, any>).__env__?.[key]
  if (runtime) return runtime
  return (import.meta as any).env?.[key]
}
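Review note: the lookup order described in the comment can be exercised in isolation. The sketch below is a hypothetical pure-function variant (`resolveEnv`, `EnvMap`, and the sample values are illustrative, not part of this change) that takes the runtime and build-time maps as arguments instead of reading `window.__env__` and `import.meta.env`. It shows that an empty runtime value, which the entrypoint script writes when a variable is unset, falls through to the build-time value.

```typescript
type EnvMap = Record<string, string | undefined>

// Hypothetical helper: same precedence as env(), but with injected sources.
function resolveEnv(
  key: string,
  runtimeEnv: EnvMap | undefined,
  buildTimeEnv: EnvMap | undefined
): string | undefined {
  const fromRuntime = runtimeEnv?.[key]
  // Truthiness check: '' (unset in the container) does not shadow the build-time value.
  if (fromRuntime) return fromRuntime
  return buildTimeEnv?.[key]
}

const runtimeEnv: EnvMap = { VITE_API_URL: 'https://example.test/api/v1', VITE_UPLOAD_ENABLED: '' }
const buildTimeEnv: EnvMap = { VITE_API_URL: '/api/v1', VITE_UPLOAD_ENABLED: 'true' }

console.log(resolveEnv('VITE_API_URL', runtimeEnv, buildTimeEnv))        // runtime value wins
console.log(resolveEnv('VITE_UPLOAD_ENABLED', runtimeEnv, buildTimeEnv)) // falls back to build time
console.log(resolveEnv('VITE_DELETE_ENABLED', runtimeEnv, buildTimeEnv)) // set in neither map
```

One consequence of the truthiness check is that a flag cannot be overridden to an empty string at runtime; the build-time value would still apply.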
+16
-1
@@ -4,6 +4,21 @@ import App from './App.vue'
|
||||
import router from './router'
|
||||
|
||||
const app = createApp(App)
|
||||
app.use(createPinia())
|
||||
const pinia = createPinia()
|
||||
app.use(pinia)
|
||||
app.use(router)
|
||||
|
||||
// Verify any session restored from sessionStorage is still valid.
|
||||
// If the backend rejects the credentials (e.g. password changed), clear them
|
||||
// before the router guard fires so the user lands on /login cleanly.
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
import { api } from '@/services/api'
|
||||
|
||||
const auth = useAuthStore()
|
||||
if (auth.isAuthenticated) {
|
||||
api.get('/auth/check').catch(() => {
|
||||
auth.clearCredentials()
|
||||
})
|
||||
}
|
||||
|
||||
app.mount('#app')
|
||||
|
||||
@@ -1,11 +1,19 @@
|
||||
import { createRouter, createWebHistory } from 'vue-router'
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
import LoginView from '@/views/LoginView.vue'
|
||||
import UploadView from '@/views/UploadView.vue'
|
||||
import TopicsView from '@/views/TopicsView.vue'
|
||||
import ChatView from '@/views/ChatView.vue'
|
||||
import BookReaderView from '@/views/BookReaderView.vue'
|
||||
|
||||
const router = createRouter({
|
||||
history: createWebHistory(import.meta.env.BASE_URL),
|
||||
routes: [
|
||||
{
|
||||
path: '/login',
|
||||
name: 'login',
|
||||
component: LoginView
|
||||
},
|
||||
{
|
||||
path: '/',
|
||||
name: 'upload',
|
||||
@@ -20,8 +28,20 @@ const router = createRouter({
|
||||
path: '/chat',
|
||||
name: 'chat',
|
||||
component: ChatView
|
||||
},
|
||||
{
|
||||
path: '/books/:id/read',
|
||||
name: 'book-reader',
|
||||
component: BookReaderView
|
||||
}
|
||||
]
|
||||
})
|
||||
|
||||
router.beforeEach((to) => {
  const auth = useAuthStore()
  if (to.name !== 'login' && !auth.isAuthenticated) {
    return { name: 'login' }
  }
})
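Review note: the guard's decision depends only on the target route name and the authentication flag, so it can be written as a pure function for testing. A minimal sketch under that assumption; `decideNavigation` and `GuardResult` are hypothetical names, and returning `undefined` stands in for "let the navigation proceed".

```typescript
type GuardResult = { name: 'login' } | undefined

// Hypothetical pure version of the beforeEach decision.
function decideNavigation(targetName: string | undefined, isAuthenticated: boolean): GuardResult {
  if (targetName !== 'login' && !isAuthenticated) {
    return { name: 'login' } // redirect unauthenticated users to the login route
  }
  return undefined // allow the navigation
}

console.log(decideNavigation('chat', false))  // redirected to login
console.log(decideNavigation('chat', true))   // allowed
console.log(decideNavigation('login', false)) // allowed, avoids a redirect loop
```

Exempting the `login` route itself is what keeps an unauthenticated user from being redirected in a loop.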
|
||||
export default router
|
||||
|
||||
@@ -1,20 +1,30 @@
|
||||
import axios from 'axios'
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
import { env } from '@/env'
|
||||
|
||||
export const api = axios.create({
|
||||
baseURL: import.meta.env.VITE_API_URL ?? '/api/v1',
|
||||
auth: {
|
||||
username: 'neurosurgeon',
|
||||
password: import.meta.env.VITE_APP_PASSWORD ?? 'changeme'
|
||||
},
|
||||
baseURL: env('VITE_API_URL') ?? '/api/v1',
|
||||
headers: {
|
||||
'Content-Type': 'application/json'
|
||||
}
|
||||
})
|
||||
|
||||
// Request interceptor: attach the signed-in user's credentials as HTTP Basic auth
api.interceptors.request.use((config) => {
  const auth = useAuthStore()
  if (auth.username && auth.password) {
    config.auth = { username: auth.username, password: auth.password }
  }
  return config
})

// Response interceptor for error normalisation
api.interceptors.response.use(
|
||||
(response) => response,
|
||||
(error) => {
|
||||
if (error.response?.status === 401) {
|
||||
useAuthStore().clearCredentials()
|
||||
window.location.href = '/login'
|
||||
return Promise.reject(new Error('Session expired. Please sign in again.'))
|
||||
}
|
||||
const message =
|
||||
error.response?.data?.error ??
|
||||
error.message ??
|
||||
|
||||
@@ -0,0 +1,28 @@
|
||||
import { defineStore } from 'pinia'
import { ref, computed } from 'vue'

const SESSION_KEY = 'auth'

export const useAuthStore = defineStore('auth', () => {
  const stored = sessionStorage.getItem(SESSION_KEY)
  const parsed = stored ? (JSON.parse(stored) as { username: string; password: string }) : null

  const username = ref<string | null>(parsed?.username ?? null)
  const password = ref<string | null>(parsed?.password ?? null)

  const isAuthenticated = computed(() => !!username.value && !!password.value)

  function setCredentials(u: string, p: string) {
    username.value = u
    password.value = p
    sessionStorage.setItem(SESSION_KEY, JSON.stringify({ username: u, password: p }))
  }

  function clearCredentials() {
    username.value = null
    password.value = null
    sessionStorage.removeItem(SESSION_KEY)
  }

  return { username, password, isAuthenticated, setCredentials, clearCredentials }
})
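Review note: the store passes the stored string straight to `JSON.parse`, which throws if the `auth` entry in `sessionStorage` is not valid JSON. Below is a minimal sketch of a more tolerant parse, assuming one wanted malformed or incomplete data to be treated as "not signed in". `parseStoredCredentials` and `StoredCredentials` are hypothetical names and are not part of this change.

```typescript
interface StoredCredentials {
  username: string
  password: string
}

// Hypothetical helper: returns null instead of throwing on bad or incomplete data.
function parseStoredCredentials(raw: string | null): StoredCredentials | null {
  if (!raw) return null
  try {
    const value: unknown = JSON.parse(raw)
    if (typeof value === 'object' && value !== null) {
      const fields = value as Record<string, unknown>
      if (typeof fields.username === 'string' && typeof fields.password === 'string') {
        return { username: fields.username, password: fields.password }
      }
    }
  } catch {
    // Malformed JSON: fall through and report no stored credentials.
  }
  return null
}

console.log(parseStoredCredentials('{"username":"u","password":"p"}'))
console.log(parseStoredCredentials('{not json'))
console.log(parseStoredCredentials('{"username":"u"}'))
```

Whether this extra tolerance is worth having is a judgment call, since the only writer of that key in this diff is `setCredentials`.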
@@ -2,11 +2,27 @@ import { defineStore } from 'pinia'
|
||||
import { ref } from 'vue'
|
||||
import { api } from '@/services/api'
|
||||
|
||||
export interface ChatSource {
|
||||
type: 'TEXT' | 'FIGURE'
|
||||
bookId?: string
|
||||
bookTitle: string
|
||||
page: number | null
|
||||
refLabel?: string
|
||||
// TEXT-specific
|
||||
chunkText?: string
|
||||
// FIGURE-specific
|
||||
figureId?: string
|
||||
label?: string
|
||||
caption?: string
|
||||
figureType?: string
|
||||
imageUrl?: string
|
||||
}
|
||||
|
||||
export interface ChatMessage {
|
||||
id: string
|
||||
role: 'USER' | 'ASSISTANT'
|
||||
content: string
|
||||
sources: Array<{ bookTitle: string; page: number | null; chunkText?: string }>
|
||||
sources: ChatSource[]
|
||||
createdAt: string
|
||||
}
|
||||
|
||||
|
||||
@@ -10,11 +10,14 @@ export interface Topic {
|
||||
}
|
||||
|
||||
export interface SourceReference {
|
||||
bookId: string | null
|
||||
bookTitle: string
|
||||
page: number | null
|
||||
}
|
||||
|
||||
export interface TopicSummary {
|
||||
id: string
|
||||
summaryNumber: number
|
||||
topicId: string
|
||||
topicName: string
|
||||
summary: string
|
||||
@@ -22,12 +25,20 @@ export interface TopicSummary {
|
||||
generatedAt: string
|
||||
}
|
||||
|
||||
export interface SavedSummaryItem {
|
||||
id: string
|
||||
summaryNumber: number
|
||||
generatedAt: string
|
||||
}
|
||||
|
||||
export const useTopicStore = defineStore('topics', () => {
|
||||
const topics = ref<Topic[]>([])
|
||||
const activeSummary = ref<TopicSummary | null>(null)
|
||||
const activeSummaryTopicId = ref<string | null>(null)
|
||||
const summaryList = ref<SavedSummaryItem[]>([])
|
||||
const loading = ref(false)
|
||||
const summaryLoading = ref(false)
|
||||
const summaryListLoading = ref(false)
|
||||
const error = ref<string | null>(null)
|
||||
|
||||
async function fetchTopics() {
|
||||
@@ -43,6 +54,36 @@ export const useTopicStore = defineStore('topics', () => {
|
||||
}
|
||||
}
|
||||
|
||||
async function fetchSummaries(topicId: string) {
|
||||
summaryListLoading.value = true
|
||||
summaryList.value = []
|
||||
error.value = null
|
||||
try {
|
||||
const response = await api.get<SavedSummaryItem[]>(`/topics/${topicId}/summaries`)
|
||||
summaryList.value = response.data
|
||||
} catch (err: any) {
|
||||
error.value = err.message
|
||||
} finally {
|
||||
summaryListLoading.value = false
|
||||
}
|
||||
}
|
||||
|
||||
async function fetchSummaryDetail(topicId: string, summaryId: string): Promise<TopicSummary | null> {
|
||||
summaryLoading.value = true
|
||||
activeSummary.value = null
|
||||
error.value = null
|
||||
try {
|
||||
const response = await api.get<TopicSummary>(`/topics/${topicId}/summaries/${summaryId}`)
|
||||
activeSummary.value = response.data
|
||||
return response.data
|
||||
} catch (err: any) {
|
||||
error.value = err.message
|
||||
return null
|
||||
} finally {
|
||||
summaryLoading.value = false
|
||||
}
|
||||
}
|
||||
|
||||
async function generateSummary(topicId: string): Promise<TopicSummary | null> {
|
||||
summaryLoading.value = true
|
||||
activeSummaryTopicId.value = topicId
|
||||
@@ -65,10 +106,14 @@ export const useTopicStore = defineStore('topics', () => {
|
||||
topics,
|
||||
activeSummary,
|
||||
activeSummaryTopicId,
|
||||
summaryList,
|
||||
loading,
|
||||
summaryLoading,
|
||||
summaryListLoading,
|
||||
error,
|
||||
fetchTopics,
|
||||
fetchSummaries,
|
||||
fetchSummaryDetail,
|
||||
generateSummary
|
||||
}
|
||||
})
|
||||
|
||||
@@ -0,0 +1,335 @@
|
||||
<template>
|
||||
<div class="reader-view">
|
||||
<!-- Header -->
|
||||
<div class="reader-header">
|
||||
<router-link to="/" class="back-link">← Library</router-link>
|
||||
<div class="reader-title">
|
||||
<h1 class="book-title">{{ book?.title ?? 'Loading…' }}</h1>
|
||||
</div>
|
||||
<div class="page-nav">
|
||||
<button class="nav-btn" :disabled="currentPage <= 1" @click="goTo(currentPage - 1)">←</button>
|
||||
<form class="page-jump" @submit.prevent="onJump">
|
||||
<input
|
||||
v-model.number="jumpInput"
|
||||
type="number"
|
||||
:min="1"
|
||||
:max="book?.pageCount ?? 1"
|
||||
class="page-input"
|
||||
/>
|
||||
<span class="page-sep">/ {{ book?.pageCount ?? '…' }}</span>
|
||||
</form>
|
||||
<button class="nav-btn" :disabled="!book || currentPage >= book.pageCount!" @click="goTo(currentPage + 1)">→</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Content -->
|
||||
<div class="reader-body">
|
||||
<div v-if="loading" class="reader-loading">
|
||||
<div class="spinner spinner-dark" style="width:28px;height:28px;margin:0 auto 0.75rem;"></div>
|
||||
<p>Loading page {{ currentPage }}…</p>
|
||||
</div>
|
||||
|
||||
<div v-else-if="error" class="reader-error card">
|
||||
<strong>Could not load page {{ currentPage }}</strong><br />
|
||||
{{ error }}
|
||||
</div>
|
||||
|
||||
<div v-else class="reader-content card">
|
||||
<div class="markdown-body" v-html="renderedHtml"></div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</template>
|
||||
|
||||
<script setup lang="ts">
|
||||
import { ref, watch, onMounted } from 'vue'
|
||||
import { useRoute } from 'vue-router'
|
||||
import { api } from '@/services/api'
|
||||
import { useBookStore } from '@/stores/bookStore'
|
||||
import type { Book } from '@/stores/bookStore'
|
||||
|
||||
const route = useRoute()
|
||||
const bookStore = useBookStore()
|
||||
|
||||
const bookId = route.params.id as string
|
||||
const book = ref<Book | null>(null)
|
||||
const currentPage = ref(1)
|
||||
const jumpInput = ref(1)
|
||||
const loading = ref(false)
|
||||
const error = ref<string | null>(null)
|
||||
const renderedHtml = ref('')
|
||||
|
||||
// Blob URLs created for the page currently shown; revoked when the next page loads
|
||||
let activeBlobUrls: string[] = []
|
||||
|
||||
onMounted(async () => {
|
||||
book.value = bookStore.books.find(b => b.id === bookId) ?? null
|
||||
if (!book.value) {
|
||||
try {
|
||||
const res = await api.get<Book>(`/books/${bookId}`)
|
||||
book.value = res.data
|
||||
} catch {
|
||||
error.value = 'Book not found.'
|
||||
return
|
||||
}
|
||||
}
|
||||
await loadPage(1)
|
||||
})
|
||||
|
||||
watch(currentPage, (page) => {
|
||||
jumpInput.value = page
|
||||
loadPage(page)
|
||||
})
|
||||
|
||||
async function goTo(page: number) {
|
||||
if (!book.value) return
|
||||
const clamped = Math.max(1, Math.min(page, book.value.pageCount ?? 1))
|
||||
if (clamped !== currentPage.value) {
|
||||
currentPage.value = clamped
|
||||
}
|
||||
}
|
||||
|
||||
function onJump() {
|
||||
goTo(jumpInput.value)
|
||||
}
|
||||
|
||||
async function loadPage(page: number) {
|
||||
loading.value = true
|
||||
error.value = null
|
||||
renderedHtml.value = ''
|
||||
|
||||
// Revoke previous blob URLs to free memory
|
||||
activeBlobUrls.forEach(u => URL.revokeObjectURL(u))
|
||||
activeBlobUrls = []
|
||||
|
||||
try {
|
||||
const res = await api.get<string>(`/books/${bookId}/pages/${page}/html`, {
|
||||
headers: { Accept: 'text/html' },
|
||||
responseType: 'text'
|
||||
})
|
||||
const html = await resolveImages(res.data)
|
||||
renderedHtml.value = html
|
||||
} catch (e: any) {
|
||||
error.value = e.message ?? 'Failed to load page.'
|
||||
} finally {
|
||||
loading.value = false
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Finds <img src="/api/v1/figures/..."> in the HTML, fetches each image
|
||||
* through the authenticated axios instance, and replaces the src with a
|
||||
* temporary blob URL so the browser can render it without re-authenticating.
|
||||
*/
|
||||
async function resolveImages(html: string): Promise<string> {
|
||||
const srcPattern = /src="(\/api\/v1\/figures\/[^"]+)"/g
|
||||
const matches = [...html.matchAll(srcPattern)]
|
||||
if (matches.length === 0) return html
|
||||
|
||||
const unique = [...new Set(matches.map(m => m[1]))]
|
||||
const blobMap: Record<string, string> = {}
|
||||
|
||||
await Promise.all(
|
||||
unique.map(async (src) => {
|
||||
try {
|
||||
const res = await api.get(src.replace(/^\/api\/v1/, ''), { responseType: 'blob' })
|
||||
const blobUrl = URL.createObjectURL(res.data)
|
||||
activeBlobUrls.push(blobUrl)
|
||||
blobMap[src] = blobUrl
|
||||
} catch {
|
||||
// Leave the original src in place; the browser will request it without credentials (and likely fail silently)
|
||||
}
|
||||
})
|
||||
)
|
||||
|
||||
return html.replace(srcPattern, (_, src) =>
|
||||
blobMap[src] ? `src="${blobMap[src]}"` : `src="${src}"`
|
||||
)
|
||||
}
|
||||
</script>
|
||||
|
||||
<style scoped>
|
||||
.reader-view {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1rem;
|
||||
max-width: 860px;
|
||||
margin: 0 auto;
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
}
|
||||
|
||||
.reader-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 1rem;
|
||||
flex-wrap: wrap;
|
||||
}
|
||||
|
||||
.back-link {
|
||||
color: #3182ce;
|
||||
text-decoration: none;
|
||||
font-size: 0.9rem;
|
||||
white-space: nowrap;
|
||||
}
|
||||
.back-link:hover { text-decoration: underline; }
|
||||
|
||||
.reader-title {
|
||||
flex: 1;
|
||||
min-width: 0;
|
||||
}
|
||||
|
||||
.book-title {
|
||||
font-size: 1.1rem;
|
||||
font-weight: 600;
|
||||
color: #1a365d;
|
||||
white-space: nowrap;
|
||||
overflow: hidden;
|
||||
text-overflow: ellipsis;
|
||||
}
|
||||
|
||||
.page-nav {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.nav-btn {
|
||||
width: 2rem;
|
||||
height: 2rem;
|
||||
border: 1px solid #cbd5e0;
|
||||
border-radius: 6px;
|
||||
background: #fff;
|
||||
cursor: pointer;
|
||||
font-size: 1rem;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
transition: background 0.15s;
|
||||
}
|
||||
.nav-btn:hover:not(:disabled) { background: #ebf8ff; border-color: #3182ce; }
|
||||
.nav-btn:disabled { opacity: 0.4; cursor: not-allowed; }
|
||||
|
||||
.page-jump {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.35rem;
|
||||
}
|
||||
|
||||
.page-input {
|
||||
width: 3.5rem;
|
||||
text-align: center;
|
||||
border: 1px solid #cbd5e0;
|
||||
border-radius: 6px;
|
||||
padding: 0.25rem 0.4rem;
|
||||
font-size: 0.9rem;
|
||||
color: #2d3748;
|
||||
}
|
||||
.page-input:focus { outline: none; border-color: #3182ce; }
|
||||
|
||||
.page-sep {
|
||||
font-size: 0.85rem;
|
||||
color: #718096;
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
.reader-body {
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
}
|
||||
|
||||
.reader-loading {
|
||||
text-align: center;
|
||||
padding: 3rem;
|
||||
color: #718096;
|
||||
}
|
||||
|
||||
.reader-error {
|
||||
padding: 1.25rem;
|
||||
background: #fff5f5;
|
||||
border: 1px solid #fed7d7;
|
||||
color: #742a2a;
|
||||
border-radius: 8px;
|
||||
}
|
||||
|
||||
.reader-content {
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
overflow-y: auto;
|
||||
padding: 2rem;
|
||||
}
|
||||
|
||||
/* Markdown rendering */
|
||||
.markdown-body {
|
||||
font-size: 0.95rem;
|
||||
line-height: 1.75;
|
||||
color: #2d3748;
|
||||
}
|
||||
|
||||
.markdown-body :deep(h1),
|
||||
.markdown-body :deep(h2),
|
||||
.markdown-body :deep(h3) {
|
||||
color: #1a365d;
|
||||
font-weight: 600;
|
||||
margin: 1.5rem 0 0.75rem;
|
||||
}
|
||||
.markdown-body :deep(h2) { font-size: 1.15rem; border-bottom: 1px solid #e2e8f0; padding-bottom: 0.4rem; }
|
||||
.markdown-body :deep(h3) { font-size: 1rem; }
|
||||
|
||||
.markdown-body :deep(p) { margin: 0.75rem 0; }
|
||||
|
||||
.markdown-body :deep(img) {
|
||||
max-width: 100%;
|
||||
border-radius: 6px;
|
||||
display: block;
|
||||
margin: 1rem auto;
|
||||
box-shadow: 0 1px 4px rgba(0,0,0,0.12);
|
||||
}
|
||||
|
||||
.markdown-body :deep(ul),
|
||||
.markdown-body :deep(ol) {
|
||||
padding-left: 1.5rem;
|
||||
margin: 0.75rem 0;
|
||||
}
|
||||
|
||||
.markdown-body :deep(code) {
|
||||
background: #f7fafc;
|
||||
border: 1px solid #e2e8f0;
|
||||
border-radius: 3px;
|
||||
padding: 0.1em 0.35em;
|
||||
font-size: 0.88em;
|
||||
}
|
||||
|
||||
.markdown-body :deep(blockquote) {
|
||||
border-left: 3px solid #3182ce;
|
||||
padding-left: 1rem;
|
||||
color: #4a5568;
|
||||
margin: 0.75rem 0;
|
||||
}
|
||||
|
||||
.markdown-body :deep(table) {
|
||||
width: 100%;
|
||||
border-collapse: collapse;
|
||||
font-size: 0.9em;
|
||||
margin: 1rem 0;
|
||||
}
|
||||
.markdown-body :deep(th),
|
||||
.markdown-body :deep(td) {
|
||||
border: 1px solid #e2e8f0;
|
||||
padding: 0.4rem 0.75rem;
|
||||
text-align: left;
|
||||
}
|
||||
.markdown-body :deep(th) { background: #f7fafc; font-weight: 600; }
|
||||
|
||||
@media (max-width: 768px) {
|
||||
.reader-view {
|
||||
max-width: 100%;
|
||||
}
|
||||
|
||||
.reader-content {
|
||||
padding: 1rem;
|
||||
}
|
||||
}
|
||||
</style>
|
||||
@@ -3,27 +3,10 @@
|
||||
<h1 class="page-title">Knowledge Chat</h1>
|
||||
<p class="page-subtitle">Ask questions grounded in your uploaded medical textbooks.</p>
|
||||
|
||||
<!-- Step 1: Topic Selection -->
|
||||
<div v-if="!chatStore.session && !selectedTopic" class="topic-selection">
|
||||
<h2 class="section-title">Select a Topic</h2>
|
||||
<div class="topic-grid">
|
||||
<button
|
||||
v-for="topic in topicStore.topics"
|
||||
:key="topic.id"
|
||||
:class="['topic-tile', { 'topic-tile-freeform': topic.id === 'free-form' }]"
|
||||
@click="handleTopicSelect(topic)"
|
||||
>
|
||||
<span class="topic-tile-name">{{ topic.name }}</span>
|
||||
<span v-if="topic.id === 'free-form'" class="topic-tile-hint">Any neurosurgery question</span>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Step 2: Topic selected — previous sessions + new chat -->
|
||||
<div v-else-if="!chatStore.session && selectedTopic" class="session-setup card">
|
||||
<!-- Session selection -->
|
||||
<div v-if="!chatStore.session" class="session-setup card">
|
||||
<div class="setup-header">
|
||||
<button class="btn-back" @click="handleBack">← Topics</button>
|
||||
<h2 class="section-title">{{ selectedTopic.name }}</h2>
|
||||
<h2 class="section-title">Free-form Chat</h2>
|
||||
</div>
|
||||
|
||||
<div v-if="chatStore.error" class="error-banner">{{ chatStore.error }}</div>
|
||||
@@ -71,6 +54,10 @@
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Chat + Reader split -->
|
||||
<div class="chat-reader-split">
|
||||
<!-- Messages + Input -->
|
||||
<div class="chat-column">
|
||||
<!-- Messages Area -->
|
||||
<div class="messages-container" ref="messagesContainer">
|
||||
<div v-if="chatStore.loading && chatStore.messages.length === 0" class="empty-state">
|
||||
@@ -89,6 +76,7 @@
|
||||
v-for="message in chatStore.messages"
|
||||
:key="message.id"
|
||||
:message="message"
|
||||
@open-source="handleOpenSource"
|
||||
/>
|
||||
<div v-if="chatStore.sending" class="typing-indicator">
|
||||
<div class="typing-bubble">
|
||||
@@ -123,6 +111,19 @@
|
||||
<p class="input-hint">Press Enter to send, Shift+Enter for new line.</p>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Inline book reader panel -->
|
||||
<BookPagePanel
|
||||
v-if="readerPanel"
|
||||
:book-id="readerPanel.bookId"
|
||||
:page="readerPanel.page"
|
||||
:book-title="readerPanel.bookTitle"
|
||||
class="reader-panel"
|
||||
@close="readerPanel = null"
|
||||
@navigate="(p) => readerPanel && (readerPanel.page = p)"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</template>
|
||||
|
||||
@@ -130,8 +131,10 @@
|
||||
import { ref, nextTick, onMounted, watch, inject } from 'vue'
|
||||
import { useChatStore } from '@/stores/chatStore'
|
||||
import { useTopicStore } from '@/stores/topicStore'
|
||||
import { useBookStore } from '@/stores/bookStore'
|
||||
import type { ChatSession } from '@/stores/chatStore'
|
||||
import ChatMessage from '@/components/ChatMessage.vue'
|
||||
import BookPagePanel from '@/components/BookPagePanel.vue'
|
||||
|
||||
interface Topic {
|
||||
id: string
|
||||
@@ -142,6 +145,7 @@ interface Topic {
|
||||
|
||||
const chatStore = useChatStore()
|
||||
const topicStore = useTopicStore()
|
||||
const bookStore = useBookStore()
|
||||
const showToast = inject<(msg: string, type?: 'error' | 'success') => void>('showToast')
|
||||
|
||||
const selectedTopic = ref<Topic | null>(null)
|
||||
@@ -150,10 +154,22 @@ const loadingTopicSessions = ref(false)
|
||||
const inputText = ref('')
|
||||
const messagesContainer = ref<HTMLElement | null>(null)
|
||||
|
||||
interface ReaderPanel { bookId: string; page: number; bookTitle?: string }
|
||||
const readerPanel = ref<ReaderPanel | null>(null)
|
||||
|
||||
function handleOpenSource(bookId: string, page: number) {
|
||||
const book = bookStore.books.find(b => b.id === bookId)
|
||||
readerPanel.value = { bookId, page, bookTitle: book?.title }
|
||||
}
|
||||
|
||||
onMounted(async () => {
|
||||
if (topicStore.topics.length === 0) {
|
||||
await topicStore.fetchTopics()
|
||||
}
|
||||
const freeForm = topicStore.topics.find((t) => t.id === 'free-form')
|
||||
if (freeForm) {
|
||||
await handleTopicSelect(freeForm)
|
||||
}
|
||||
})
|
||||
|
||||
watch(
|
||||
@@ -189,11 +205,6 @@ async function handleTopicSelect(topic: Topic) {
|
||||
loadingTopicSessions.value = false
|
||||
}
|
||||
|
||||
function handleBack() {
|
||||
selectedTopic.value = null
|
||||
topicSessions.value = []
|
||||
}
|
||||
|
||||
async function handleNewChat() {
|
||||
const ok = await chatStore.createSession(selectedTopic.value!.id)
|
||||
if (!ok) {
|
||||
@@ -207,9 +218,7 @@ async function handleResumeSession(session: ChatSession) {
|
||||
}
|
||||
|
||||
function handleLeaveSession() {
|
||||
// Leave without deleting — session stays in DB and will appear in "Previous Chats"
|
||||
chatStore.leaveSession()
|
||||
// Refresh the sessions list for the current topic
|
||||
if (selectedTopic.value) {
|
||||
loadingTopicSessions.value = true
|
||||
chatStore.fetchSessionsByTopic(selectedTopic.value.id).then((sessions) => {
|
||||
@@ -231,12 +240,6 @@ async function handleSend() {
|
||||
</script>
|
||||
|
||||
<style scoped>
|
||||
.topic-selection {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1.25rem;
|
||||
}
|
||||
|
||||
.section-title {
|
||||
font-size: 1.1rem;
|
||||
font-weight: 600;
|
||||
@@ -244,52 +247,6 @@ async function handleSend() {
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
.topic-grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fill, minmax(220px, 1fr));
|
||||
gap: 0.75rem;
|
||||
}
|
||||
|
||||
.topic-tile {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: flex-start;
|
||||
gap: 0.25rem;
|
||||
padding: 1rem 1.1rem;
|
||||
background: white;
|
||||
border: 1px solid #e2e8f0;
|
||||
border-radius: 8px;
|
||||
cursor: pointer;
|
||||
text-align: left;
|
||||
transition: border-color 0.15s, box-shadow 0.15s;
|
||||
}
|
||||
|
||||
.topic-tile:hover {
|
||||
border-color: #3182ce;
|
||||
box-shadow: 0 2px 8px rgba(49, 130, 206, 0.15);
|
||||
}
|
||||
|
||||
.topic-tile-freeform {
|
||||
border-style: dashed;
|
||||
border-color: #a0aec0;
|
||||
}
|
||||
|
||||
.topic-tile-freeform:hover {
|
||||
border-color: #718096;
|
||||
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.08);
|
||||
}
|
||||
|
||||
.topic-tile-name {
|
||||
font-size: 0.9rem;
|
||||
font-weight: 600;
|
||||
color: #2d3748;
|
||||
}
|
||||
|
||||
.topic-tile-hint {
|
||||
font-size: 0.78rem;
|
||||
color: #a0aec0;
|
||||
}
|
||||
|
||||
.session-setup {
|
||||
max-width: 540px;
|
||||
}
|
||||
@@ -381,6 +338,29 @@ async function handleSend() {
|
||||
min-height: 500px;
|
||||
}
|
||||
|
||||
.chat-reader-split {
|
||||
display: flex;
|
||||
flex: 1;
|
||||
min-height: 0;
|
||||
gap: 0;
|
||||
}
|
||||
|
||||
.chat-column {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
flex: 1;
|
||||
min-width: 0;
|
||||
gap: 1rem;
|
||||
}
|
||||
|
||||
.reader-panel {
|
||||
width: 420px;
|
||||
flex-shrink: 0;
|
||||
border-radius: 10px;
|
||||
margin-left: 1rem;
|
||||
box-shadow: -2px 0 8px rgba(0, 0, 0, 0.07);
|
||||
}
|
||||
|
||||
.session-bar {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
@@ -505,4 +485,26 @@ async function handleSend() {
|
||||
font-size: 0.875rem;
|
||||
margin-bottom: 0.75rem;
|
||||
}
|
||||
|
||||
@media (max-width: 768px) {
|
||||
.chat-layout {
|
||||
height: auto;
|
||||
min-height: unset;
|
||||
}
|
||||
|
||||
.chat-reader-split {
|
||||
flex-direction: column;
|
||||
}
|
||||
|
||||
.chat-column {
|
||||
min-height: 60vh;
|
||||
}
|
||||
|
||||
.reader-panel {
|
||||
width: 100%;
|
||||
margin-left: 0;
|
||||
margin-top: 1rem;
|
||||
box-shadow: none;
|
||||
}
|
||||
}
|
||||
</style>
|
||||
|
||||
@@ -0,0 +1,183 @@
|
||||
<template>
|
||||
<div class="login-wrapper">
|
||||
<div class="login-card card">
|
||||
<div class="login-header">
|
||||
<span class="login-icon">🧠</span>
|
||||
<h1 class="login-title">AI Teacher</h1>
|
||||
<p class="login-subtitle">Neurosurgeon Learning Platform</p>
|
||||
</div>
|
||||
|
||||
<form class="login-form" @submit.prevent="handleSubmit">
|
||||
<div class="form-group">
|
||||
<label for="username">Username</label>
|
||||
<input
|
||||
id="username"
|
||||
v-model="username"
|
||||
type="text"
|
||||
autocomplete="username"
|
||||
required
|
||||
:disabled="loading"
|
||||
/>
|
||||
</div>
|
||||
|
||||
<div class="form-group">
|
||||
<label for="password">Password</label>
|
||||
<input
|
||||
id="password"
|
||||
v-model="password"
|
||||
type="password"
|
||||
autocomplete="current-password"
|
||||
required
|
||||
:disabled="loading"
|
||||
/>
|
||||
</div>
|
||||
|
||||
<div v-if="errorMessage" class="login-error">
|
||||
{{ errorMessage }}
|
||||
</div>
|
||||
|
||||
<button type="submit" class="btn btn-primary login-btn" :disabled="loading || !username || !password">
|
||||
<span v-if="loading" class="spinner"></span>
|
||||
<span v-else>Sign in</span>
|
||||
</button>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
</template>
|
||||
|
||||
<script setup lang="ts">
|
||||
import { ref } from 'vue'
|
||||
import { useRouter } from 'vue-router'
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
import { api } from '@/services/api'
|
||||
|
||||
const router = useRouter()
|
||||
const authStore = useAuthStore()
|
||||
|
||||
const username = ref('')
|
||||
const password = ref('')
|
||||
const loading = ref(false)
|
||||
const errorMessage = ref('')
|
||||
|
||||
async function handleSubmit() {
|
||||
errorMessage.value = ''
|
||||
loading.value = true
|
||||
|
||||
authStore.setCredentials(username.value, password.value)
|
||||
|
||||
try {
|
||||
await api.get('/auth/check')
|
||||
router.push('/')
|
||||
} catch {
|
||||
authStore.clearCredentials()
|
||||
errorMessage.value = 'Invalid username or password.'
|
||||
} finally {
|
||||
loading.value = false
|
||||
}
|
||||
}
|
||||
</script>
|
||||
|
||||
<style scoped>
|
||||
.login-wrapper {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
min-height: 100vh;
|
||||
background: #f0f4f8;
|
||||
}
|
||||
|
||||
.login-card {
|
||||
width: 100%;
|
||||
max-width: 380px;
|
||||
padding: 2rem;
|
||||
}
|
||||
|
||||
.login-header {
|
||||
text-align: center;
|
||||
margin-bottom: 1.75rem;
|
||||
}
|
||||
|
||||
.login-icon {
|
||||
font-size: 2.5rem;
|
||||
display: block;
|
||||
margin-bottom: 0.5rem;
|
||||
}
|
||||
|
||||
.login-title {
|
||||
font-size: 1.5rem;
|
||||
font-weight: 700;
|
||||
color: #1a365d;
|
||||
margin-bottom: 0.25rem;
|
||||
}
|
||||
|
||||
.login-subtitle {
|
||||
font-size: 0.85rem;
|
||||
color: #718096;
|
||||
}
|
||||
|
||||
.login-form {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 1rem;
|
||||
}
|
||||
|
||||
.form-group {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 0.35rem;
|
||||
}
|
||||
|
||||
.form-group label {
|
||||
font-size: 0.875rem;
|
||||
font-weight: 600;
|
||||
color: #4a5568;
|
||||
}
|
||||
|
||||
.form-group input {
|
||||
padding: 0.6rem 0.75rem;
|
||||
border: 1px solid #cbd5e0;
|
||||
border-radius: 6px;
|
||||
font-size: 0.95rem;
|
||||
outline: none;
|
||||
transition: border-color 0.15s;
|
||||
}
|
||||
|
||||
.form-group input:focus {
|
||||
border-color: #3182ce;
|
||||
box-shadow: 0 0 0 3px rgba(49, 130, 206, 0.15);
|
||||
}
|
||||
|
||||
.form-group input:disabled {
|
||||
background: #f7fafc;
|
||||
color: #a0aec0;
|
||||
}
|
||||
|
||||
.login-error {
|
||||
padding: 0.6rem 0.75rem;
|
||||
background: #fed7d7;
|
||||
color: #c53030;
|
||||
border: 1px solid #fc8181;
|
||||
border-radius: 6px;
|
||||
font-size: 0.875rem;
|
||||
}
|
||||
|
||||
.login-btn {
|
||||
width: 100%;
|
||||
justify-content: center;
|
||||
padding: 0.7rem;
|
||||
font-size: 0.95rem;
|
||||
margin-top: 0.25rem;
|
||||
}
|
||||
|
||||
@media (max-width: 768px) {
|
||||
.login-wrapper {
|
||||
align-items: flex-start;
|
||||
padding-top: 2rem;
|
||||
min-height: unset;
|
||||
}
|
||||
|
||||
.login-card {
|
||||
max-width: 100%;
|
||||
}
|
||||
}
|
||||
</style>
|
||||
@@ -1,7 +1,7 @@
|
||||
<template>
|
||||
<div class="topics-view">
|
||||
<h1 class="page-title">Topics</h1>
|
||||
<p class="page-subtitle">Select a topic to generate an AI-powered summary from uploaded books.</p>
|
||||
<p class="page-subtitle">Select a topic to view or generate an AI-powered summary from uploaded books.</p>
|
||||
|
||||
<!-- Loading state -->
|
||||
<div v-if="topicStore.loading" class="empty-state">
|
||||
@@ -18,15 +18,39 @@
|
||||
</div>
|
||||
|
||||
<div v-else class="topics-layout">
|
||||
<!-- Topic Grid -->
|
||||
<div class="topic-grid">
|
||||
<TopicCard
|
||||
v-for="topic in topicStore.topics"
|
||||
:key="topic.id"
|
||||
:topic="topic"
|
||||
:is-generating="topicStore.activeSummaryTopicId === topic.id"
|
||||
@generate="handleGenerate"
|
||||
/>
|
||||
<div class="topics-main">
|
||||
|
||||
<!-- Summary history list -->
|
||||
<div v-if="selectedTopicId" class="history-panel card">
|
||||
<div class="history-header">
|
||||
<span class="history-title">Saved summaries</span>
|
||||
<button class="btn btn-primary btn-sm" :disabled="topicStore.summaryLoading" @click="handleGenerate(selectedTopicId!)">
|
||||
<span v-if="topicStore.summaryLoading" class="spinner" style="width:14px;height:14px;display:inline-block;vertical-align:middle;margin-right:4px;"></span>
|
||||
Generate New
|
||||
</button>
|
||||
</div>
|
||||
|
||||
<div v-if="topicStore.summaryListLoading" class="history-loading">
|
||||
<div class="spinner spinner-dark" style="width:20px;height:20px;margin-right:8px;display:inline-block;vertical-align:middle;"></div>
|
||||
Loading...
|
||||
</div>
|
||||
|
||||
<div v-else-if="topicStore.summaryList.length === 0" class="history-empty">
|
||||
No summaries yet. Click "Generate New" to create one.
|
||||
</div>
|
||||
|
||||
<div v-else class="history-list">
|
||||
<button
|
||||
v-for="item in topicStore.summaryList"
|
||||
:key="item.id"
|
||||
class="history-chip"
|
||||
:class="{ 'history-chip--active': topicStore.activeSummary?.id === item.id }"
|
||||
@click="handleLoadSummary(item)"
|
||||
>
|
||||
Summary #{{ item.summaryNumber }}
|
||||
<span class="history-chip-date">· {{ formatDateShort(item.generatedAt) }}</span>
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Summary Panel -->
|
||||
@@ -48,15 +72,26 @@
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<div v-else-if="topicStore.activeSummary" class="summary-panel card">
|
||||
<div class="summary-header">
|
||||
<h2 class="summary-topic-name">{{ topicStore.activeSummary.topicName }}</h2>
|
||||
<span class="summary-timestamp">{{ formatDate(topicStore.activeSummary.generatedAt) }}</span>
|
||||
<div v-else-if="!topicStore.activeSummary" class="summary-panel card summary-placeholder">
|
||||
<p class="summary-placeholder-text">
|
||||
{{ selectedTopicId ? 'Select a saved summary or generate a new one.' : 'Select a topic to get started.' }}
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<div class="summary-text">{{ topicStore.activeSummary.summary }}</div>
|
||||
<div v-else class="summary-panel card">
|
||||
<div class="summary-header">
|
||||
<h2 class="summary-topic-name">{{ topicStore.activeSummary.topicName }}</h2>
|
||||
<div class="summary-meta">
|
||||
<span v-if="topicStore.activeSummary.summaryNumber" class="summary-number">
|
||||
Summary #{{ topicStore.activeSummary.summaryNumber }}
|
||||
</span>
|
||||
<span class="summary-timestamp">{{ formatDate(topicStore.activeSummary.generatedAt) }}</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div v-if="topicStore.activeSummary.sources.length > 0" class="sources-section">
|
||||
<div class="summary-text summary-text--markdown" v-html="renderedSummary" @click="handleSummaryClick"></div>
|
||||
|
||||
<div ref="sourcesSection" v-if="topicStore.activeSummary.sources.length > 0" class="sources-section">
|
||||
<button class="sources-toggle" @click="showSources = !showSources">
|
||||
Sources ({{ topicStore.activeSummary.sources.length }})
|
||||
<span>{{ showSources ? '▲' : '▼' }}</span>
|
||||
@@ -66,35 +101,115 @@
|
||||
v-for="(source, idx) in topicStore.activeSummary.sources"
|
||||
:key="idx"
|
||||
class="source-chip"
|
||||
:class="{ 'source-chip--clickable': source.bookId && source.page }"
|
||||
@click="source.bookId && source.page ? handleOpenSource(source.bookId, source.page) : undefined"
|
||||
>
|
||||
<span class="source-icon">📖</span>
|
||||
<span class="source-book">{{ source.bookTitle }}</span>
|
||||
<span v-if="source.page" class="source-page">p. {{ source.page }}</span>
|
||||
<span v-if="source.page" class="source-page">p. {{ source.page }}</span>
|
||||
<span v-if="source.bookId && source.page" class="source-open-hint">↗</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<BookPagePanel
|
||||
v-if="readerPanel"
|
||||
:book-id="readerPanel.bookId"
|
||||
:page="readerPanel.page"
|
||||
:book-title="readerPanel.bookTitle"
|
||||
class="reader-panel"
|
||||
@close="readerPanel = null"
|
||||
@navigate="(p) => readerPanel && (readerPanel.page = p)"
|
||||
/>
|
||||
</div>
|
||||
<div v-else class="no-sources">
|
||||
No source citations available for this summary.
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Topic Grid -->
|
||||
<div class="topic-grid">
|
||||
<TopicCard
|
||||
v-for="topic in summaryTopics"
|
||||
:key="topic.id"
|
||||
:topic="topic"
|
||||
:is-generating="topicStore.activeSummaryTopicId === topic.id"
|
||||
:is-selected="selectedTopicId === topic.id"
|
||||
@generate="handleTopicClick"
|
||||
/>
|
||||
</div>
|
||||
</div><!-- end topics-main -->
|
||||
</div>
|
||||
</div>
|
||||
</template>
|
||||
|
||||
<script setup lang="ts">
|
||||
import { ref, onMounted, inject } from 'vue'
|
||||
import { ref, computed, onMounted, inject } from 'vue'
|
||||
import { marked } from 'marked'
|
||||
import { RouterLink } from 'vue-router'
|
||||
import { useTopicStore } from '@/stores/topicStore'
|
||||
import { useTopicStore, type SavedSummaryItem } from '@/stores/topicStore'
|
||||
import { useBookStore } from '@/stores/bookStore'
|
||||
import TopicCard from '@/components/TopicCard.vue'
|
||||
import BookPagePanel from '@/components/BookPagePanel.vue'
|
||||
|
||||
const topicStore = useTopicStore()
|
||||
const bookStore = useBookStore()
|
||||
const showToast = inject<(msg: string, type?: 'error' | 'success') => void>('showToast')
|
||||
|
||||
const showSources = ref(true)
|
||||
const summaryError = ref<string | null>(null)
|
||||
const isNoBooks = ref(false)
|
||||
const sourcesSection = ref<HTMLElement | null>(null)
|
||||
const selectedTopicId = ref<string | null>(null)
|
||||
|
||||
interface ReaderPanel { bookId: string; page: number; bookTitle?: string }
|
||||
const readerPanel = ref<ReaderPanel | null>(null)
|
||||
|
||||
const summaryTopics = computed(() => topicStore.topics.filter(t => t.id !== 'free-form'))
|
||||
|
||||
const renderedSummary = computed(() => {
|
||||
if (!topicStore.activeSummary) return ''
|
||||
const html = marked.parse(topicStore.activeSummary.summary) as string
|
||||
return html.replace(/\[S(\d+)\]/g, '<span class="source-ref">[S$1]</span>')
|
||||
})
|
||||
|
||||
function handleSummaryClick(e: MouseEvent) {
|
||||
if ((e.target as HTMLElement).classList.contains('source-ref')) {
|
||||
showSources.value = true
|
||||
sourcesSection.value?.scrollIntoView({ behavior: 'smooth', block: 'start' })
|
||||
}
|
||||
}
|
||||
|
||||
function handleOpenSource(bookId: string, page: number) {
|
||||
const book = bookStore.books.find(b => b.id === bookId)
|
||||
readerPanel.value = { bookId, page, bookTitle: book?.title }
|
||||
showSources.value = true
|
||||
}
|
||||
|
||||
async function handleTopicClick(topicId: string) {
|
||||
if (selectedTopicId.value !== topicId) {
|
||||
selectedTopicId.value = topicId
|
||||
topicStore.activeSummary = null
|
||||
summaryError.value = null
|
||||
await topicStore.fetchSummaries(topicId)
|
||||
// Auto-load the latest summary if any exist (takes the last item, so the list is assumed to be ordered oldest first)
|
||||
const list = topicStore.summaryList
|
||||
if (list.length > 0) {
|
||||
await topicStore.fetchSummaryDetail(topicId, list[list.length - 1].id)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
async function handleLoadSummary(item: SavedSummaryItem) {
|
||||
if (!selectedTopicId.value) return
|
||||
summaryError.value = null
|
||||
await topicStore.fetchSummaryDetail(selectedTopicId.value, item.id)
|
||||
}
|
||||
|
||||
onMounted(async () => {
|
||||
await topicStore.fetchTopics()
|
||||
if (bookStore.books.length === 0) {
|
||||
await bookStore.fetchBooks()
|
||||
}
|
||||
})
|
||||
|
||||
async function handleGenerate(topicId: string) {
|
||||
@@ -109,27 +224,122 @@ async function handleGenerate(topicId: string) {
|
||||
summaryError.value.toLowerCase().includes('no books') ||
|
||||
summaryError.value.toLowerCase().includes('knowledge source')
|
||||
showToast?.(summaryError.value, 'error')
|
||||
} else {
|
||||
// Refresh the history list to include the newly saved summary
|
||||
await topicStore.fetchSummaries(topicId)
|
||||
}
|
||||
}
|
||||
|
||||
function formatDate(iso: string): string {
|
||||
return new Date(iso).toLocaleString()
|
||||
}
|
||||
|
||||
function formatDateShort(iso: string): string {
|
||||
return new Date(iso).toLocaleDateString(undefined, { month: 'short', day: 'numeric' })
|
||||
}
|
||||
</script>
|
||||
|
||||
<style scoped>
|
||||
.topics-layout {
|
||||
display: flex;
|
||||
gap: 2rem;
|
||||
}
|
||||
|
||||
.topics-main {
|
||||
flex: 1;
|
||||
min-width: 0;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 2rem;
|
||||
}
|
||||
|
||||
.reader-panel {
|
||||
margin-top: 1rem;
|
||||
height: 600px;
|
||||
min-height: 400px;
|
||||
border-radius: 10px;
|
||||
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.07);
|
||||
}
|
||||
|
||||
.topic-grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
|
||||
gap: 1rem;
|
||||
}
|
||||
|
||||
/* History panel */
|
||||
.history-panel {
|
||||
border-top: 3px solid #805ad5;
|
||||
padding: 1rem;
|
||||
}
|
||||
|
||||
.history-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: space-between;
|
||||
margin-bottom: 0.75rem;
|
||||
}
|
||||
|
||||
.history-title {
|
||||
font-size: 0.875rem;
|
||||
font-weight: 600;
|
||||
color: #553c9a;
|
||||
}
|
||||
|
||||
.btn-sm {
|
||||
font-size: 0.8rem;
|
||||
padding: 0.3rem 0.75rem;
|
||||
}
|
||||
|
||||
.history-loading {
|
||||
font-size: 0.85rem;
|
||||
color: #718096;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
}
|
||||
|
||||
.history-empty {
|
||||
font-size: 0.85rem;
|
||||
color: #a0aec0;
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.history-list {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.history-chip {
|
||||
background: #faf5ff;
|
||||
border: 1px solid #d6bcfa;
|
||||
border-radius: 6px;
|
||||
padding: 0.3rem 0.75rem;
|
||||
font-size: 0.8rem;
|
||||
font-weight: 500;
|
||||
color: #553c9a;
|
||||
cursor: pointer;
|
||||
transition: background 0.15s, border-color 0.15s;
|
||||
}
|
||||
|
||||
.history-chip:hover {
|
||||
background: #e9d8fd;
|
||||
border-color: #b794f4;
|
||||
}
|
||||
|
||||
.history-chip--active {
|
||||
background: #805ad5;
|
||||
border-color: #805ad5;
|
||||
color: #fff;
|
||||
}
|
||||
|
||||
.history-chip-date {
|
||||
font-weight: 400;
|
||||
color: inherit;
|
||||
opacity: 0.75;
|
||||
}
|
||||
|
||||
/* Summary panel */
|
||||
.summary-panel {
|
||||
border-top: 3px solid #3182ce;
|
||||
}
|
||||
@@ -170,6 +380,22 @@ function formatDate(iso: string): string {
|
||||
color: #1a365d;
|
||||
}
|
||||
|
||||
.summary-meta {
|
||||
display: flex;
|
||||
align-items: baseline;
|
||||
gap: 0.5rem;
|
||||
}
|
||||
|
||||
.summary-number {
|
||||
font-size: 0.8rem;
|
||||
font-weight: 600;
|
||||
color: #805ad5;
|
||||
background: #faf5ff;
|
||||
border: 1px solid #d6bcfa;
|
||||
border-radius: 4px;
|
||||
padding: 0.1rem 0.4rem;
|
||||
}
|
||||
|
||||
.summary-timestamp {
|
||||
font-size: 0.8rem;
|
||||
color: #a0aec0;
|
||||
@@ -179,10 +405,60 @@ function formatDate(iso: string): string {
|
||||
font-size: 0.95rem;
|
||||
line-height: 1.7;
|
||||
color: #2d3748;
|
||||
white-space: pre-wrap;
|
||||
margin-bottom: 1rem;
|
||||
}
|
||||
|
||||
.summary-text--markdown {
|
||||
white-space: normal;
|
||||
}
|
||||
|
||||
.summary-text--markdown :deep(h1),
|
||||
.summary-text--markdown :deep(h2),
|
||||
.summary-text--markdown :deep(h3),
|
||||
.summary-text--markdown :deep(h4) {
|
||||
font-weight: 700;
|
||||
margin: 0.75rem 0 0.35rem;
|
||||
line-height: 1.3;
|
||||
color: #1a202c;
|
||||
}
|
||||
|
||||
.summary-text--markdown :deep(h1) { font-size: 1.15rem; }
|
||||
.summary-text--markdown :deep(h2) { font-size: 1.05rem; }
|
||||
.summary-text--markdown :deep(h3) { font-size: 0.975rem; }
|
||||
.summary-text--markdown :deep(h4) { font-size: 0.925rem; }
|
||||
|
||||
.summary-text--markdown :deep(p) { margin: 0.4rem 0; }
|
||||
|
||||
.summary-text--markdown :deep(ul),
|
||||
.summary-text--markdown :deep(ol) {
|
||||
padding-left: 1.4rem;
|
||||
margin: 0.4rem 0;
|
||||
}
|
||||
|
||||
.summary-text--markdown :deep(li) { margin: 0.2rem 0; }
|
||||
|
||||
.summary-text--markdown :deep(strong) {
|
||||
font-weight: 700;
|
||||
color: #1a202c;
|
||||
}
|
||||
|
||||
.summary-text--markdown :deep(em) { font-style: italic; }
|
||||
|
||||
.summary-text--markdown :deep(code) {
|
||||
background: #edf2f7;
|
||||
border-radius: 3px;
|
||||
padding: 0.1em 0.35em;
|
||||
font-size: 0.87em;
|
||||
font-family: monospace;
|
||||
}
|
||||
|
||||
.summary-text--markdown :deep(blockquote) {
|
||||
border-left: 3px solid #bee3f8;
|
||||
margin: 0.5rem 0;
|
||||
padding: 0.25rem 0.75rem;
|
||||
color: #4a5568;
|
||||
}
|
||||
|
||||
.sources-section {
|
||||
border-top: 1px solid #e2e8f0;
|
||||
padding-top: 0.75rem;
|
||||
@@ -223,6 +499,20 @@ function formatDate(iso: string): string {
|
||||
font-size: 0.8rem;
|
||||
}
|
||||
|
||||
.source-chip--clickable {
|
||||
cursor: pointer;
|
||||
transition: background 0.15s, border-color 0.15s;
|
||||
}
|
||||
|
||||
.source-chip--clickable:hover {
|
||||
background: #bee3f8;
|
||||
border-color: #90cdf4;
|
||||
}
|
||||
|
||||
.source-icon {
|
||||
font-size: 0.8rem;
|
||||
}
|
||||
|
||||
.source-book {
|
||||
color: #2b6cb0;
|
||||
font-weight: 500;
|
||||
@@ -232,6 +522,12 @@ function formatDate(iso: string): string {
|
||||
color: #718096;
|
||||
}
|
||||
|
||||
.source-open-hint {
|
||||
font-size: 0.75rem;
|
||||
color: #3182ce;
|
||||
margin-left: 0.1rem;
|
||||
}
|
||||
|
||||
.no-sources {
|
||||
font-size: 0.85rem;
|
||||
color: #a0aec0;
|
||||
@@ -252,4 +548,31 @@ function formatDate(iso: string): string {
|
||||
color: #3182ce;
|
||||
text-decoration: underline;
|
||||
}
|
||||
|
||||
.summary-placeholder {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
min-height: 6rem;
|
||||
border-top-color: #cbd5e0;
|
||||
}
|
||||
|
||||
.summary-placeholder-text {
|
||||
font-size: 0.95rem;
|
||||
color: #a0aec0;
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
.summary-text--markdown :deep(.source-ref) {
|
||||
color: #3182ce;
|
||||
font-weight: 600;
|
||||
cursor: pointer;
|
||||
border-radius: 3px;
|
||||
padding: 0 0.15em;
|
||||
}
|
||||
|
||||
.summary-text--markdown :deep(.source-ref:hover) {
|
||||
background: #ebf8ff;
|
||||
text-decoration: underline;
|
||||
}
|
||||
</style>
|
||||
|
||||
@@ -1,10 +1,10 @@
|
||||
<template>
|
||||
<div class="upload-view">
|
||||
<h1 class="page-title">Book Library</h1>
|
||||
<p class="page-subtitle">Upload medical textbooks (PDF) to build the knowledge base.</p>
|
||||
<p v-if="uploadEnabled" class="page-subtitle">Upload medical textbooks (PDF) to build the knowledge base.</p>
|
||||
|
||||
<!-- Upload Section -->
|
||||
<div class="upload-section card">
|
||||
<div v-if="uploadEnabled" class="upload-section card">
|
||||
<h2 class="section-title">Upload a Book</h2>
|
||||
|
||||
<div
|
||||
@@ -87,6 +87,7 @@
|
||||
:key="book.id"
|
||||
:book="book"
|
||||
:deleting="deletingId === book.id"
|
||||
:delete-enabled="deleteEnabled"
|
||||
@delete="handleDelete"
|
||||
/>
|
||||
</div>
|
||||
@@ -98,6 +99,10 @@
|
||||
import { ref, onMounted, onUnmounted, inject } from 'vue'
|
||||
import { useBookStore } from '@/stores/bookStore'
|
||||
import BookCard from '@/components/BookCard.vue'
|
||||
import { env } from '@/env'
|
||||
|
||||
const uploadEnabled = env('VITE_UPLOAD_ENABLED') !== 'false'
|
||||
const deleteEnabled = env('VITE_DELETE_ENABLED') !== 'false'
|
||||
|
||||
const bookStore = useBookStore()
|
||||
const showToast = inject<(msg: string, type?: 'error' | 'success') => void>('showToast')
|
||||
|
||||
@@ -0,0 +1,73 @@
|
||||
# Embedding & Retrieval Pipeline Checklist: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Purpose**: Author self-review of embedding pipeline and retrieval requirements quality — validates completeness, clarity, and measurability before implementation tasks are written
|
||||
**Created**: 2026-04-03
|
||||
**Feature**: [spec.md](../spec.md) | [research.md](../research.md) | [data-model.md](../data-model.md)
|
||||
**Focus**: A (Embedding pipeline) + B (Retrieval & ranking) | Depth: Standard | Audience: Author
|
||||
|
||||
---
|
||||
|
||||
## Requirement Completeness — Embedding Pipeline
|
||||
|
||||
- [X] CHK001 - Is the definition of "inspect every page" complete — does the spec cover pages that have no extractable content layer (fully scanned/rasterised pages)? Yes [Completeness, Spec §FR-001, Assumption §6]
|
||||
|
||||
- [X] CHK002 - Does FR-002 define what "independently searchable" means in practice — specifically, is it clear that image chunks must be retrievable without a co-located text chunk? [Clarity, Spec §FR-002] - No image should be retrieved along linked text.
|
||||
|
||||
- [X] CHK003 - Is the minimum acceptable quality of the "descriptive textual representation" (FR-003) specified (e.g., must it include structural relationships, labelled regions, or clinical terms), or is any non-empty description sufficient? [Clarity, Spec §FR-003, Gap] - Any non-empty description is sufficient. The text just below the image should contain the correct clinical term.
|
||||
|
||||
- [C] CHK004 - Are the caption-detection rules defined at spec level; specifically, what pattern or signal determines that a piece of text is a caption vs. body text adjacent to an image? [Clarity, Spec §FR-004, Gap] - We assume that text starting with "Fig." followed by a number is the text description of a given image.
|
||||
|
||||
- [X] CHK005 - Does FR-004 specify what metadata is stored when a caption is absent — is the caption field omitted, left empty, or populated with a generated substitute? [Completeness, Spec §FR-004] - generated substitute
|
||||
|
||||
- [X] CHK006 - Is the "minimum meaningful-content threshold" (FR-007) quantified in the spec, or is it deferred entirely to implementation? The assumption section says "size threshold determined during implementation" — is this intentional and acceptable at the spec level? [Ambiguity, Spec §FR-007, Assumption §3] - Deferred to implementation
|
||||
|
||||
- [X] CHK007 - Does FR-008 specify the observable outcome of per-page image failures — specifically, is there a requirement that the book's processing status or error log is accessible to the user or admin after partial failure? [Completeness, Spec §FR-008, Gap] online logs
|
||||
|
||||
- [X] CHK008 - Is FR-010 ("MUST NOT degrade accuracy or completeness of text-only embedding") measurable — does the spec define a baseline or acceptance criterion against which degradation can be detected? [Measurability, Spec §FR-010, Gap] no definition
|
||||
|
||||
- [X] CHK009 - Are re-embedding requirements complete — does the spec cover what happens to in-progress queries and cached results while a book is being re-embedded? [Coverage, Assumption §8, Gap] - No need to take that into account.
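
The caption rule assumed in CHK004 (text starting with "Fig." followed by a number) could be checked with a small helper along the lines of the sketch below. The class name `CaptionLabels`, the method `extractLabel`, and the exact pattern (including the choice to accept `12-4` or `12.4` style numbers) are assumptions made for illustration; the spec does not define them.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: names and pattern are illustrative, not taken from the spec.
final class CaptionLabels {

    // Assumed pattern: "Fig." at the start of the text, optional whitespace,
    // then a number such as "3", "12-4" or "12.4".
    private static final Pattern FIG_LABEL =
            Pattern.compile("^\\s*(Fig\\.\\s*\\d+(?:[-.]\\d+)*)");

    /** Returns the figure label (e.g. "Fig. 12-4") if the text looks like a caption. */
    static Optional<String> extractLabel(String text) {
        if (text == null) {
            return Optional.empty();
        }
        Matcher m = FIG_LABEL.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractLabel("Fig. 12-4 Coronal cross-section of the cavernous sinus"));
        System.out.println(extractLabel("The cavernous sinus contains several cranial nerves."));
    }
}
```

Anchoring the pattern at the start of the text is what separates a caption from body text that merely mentions a figure, such as "As shown in Fig. 12-4".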
|
||||
|
||||
---
|
||||
|
||||
## Requirement Completeness — Retrieval & Ranking
|
||||
|
||||
- [X] CHK010 - Does FR-006 define how image and text chunks are ranked relative to each other: is ranking unified (single score), or are the two modalities ranked independently with separate topK controls? [Clarity, Spec §FR-006, Gap] - Ranked independently, with a separate topK for each.
|
||||
|
||||
- [X] CHK011 - Is the relevance threshold for figure retrieval specified — i.e., at what similarity score (or other criterion) should a figure be excluded from results? [Clarity, Spec §FR-006, Gap] not specified
|
||||
|
||||
- [X] CHK012 - Are deduplication rules defined for the case where the same figure appears both in the semantic figure search and the chunk-to-figure reference lookup — which representation wins, or are both included? [Completeness, data-model.md §RetrievalResult, Gap] not specified
|
||||
|
||||
- [X] CHK013 - Is the requirement for parent section context expansion in the spec — specifically, is there a requirement that the LLM receives the full section text (not just the chunk) when a text chunk is retrieved? [Gap, research.md §Decision 1] - the LLM should receive the full section to have maximum context.
|
||||
|
||||
- [X] CHK014 - Does the spec define the required structure of the LLM prompt when both text context and figures are present — or is prompt design left entirely to implementation? [Completeness, Gap] - Left to implementation
|
||||
|
||||
- [X] CHK015 - Is SC-002 ("70% recall on image queries") sufficient as a measurability criterion — is the test set composition (10 queries) and evaluation method documented, or does it rely on an undefined manual process? [Measurability, Spec §SC-002] - Manual process.
|
||||
|
||||
---
|
||||
|
||||
## Scenario Coverage — Edge & Exception Cases
|
||||
|
||||
- [X] CHK016 - Does the spec address the scenario where a query is relevant to a book section that has figures but none of those figures rank above the retrieval threshold, and is the expected fallback behaviour defined? [Coverage, Edge Case, Gap] - In this case the figure should still be retrieved and shown to the user.
|
||||
|
||||
- [X] CHK017 - Is the scenario of a figure retrieved in search results but whose image file is missing from the file store covered; what should the system return to the user in that case? [Coverage, Exception Flow, Gap] - A missing-image error, shown in the frontend as a broken image link.
|
||||
|
||||
- [X] CHK018 - Are requirements defined for multi-image pages where images have conflicting captions or share a single composite caption: which image gets the caption, or is it duplicated? [Coverage, Spec §FR-004, Edge Case] - This case does not exist.
|
||||
|
||||
---
|
||||
|
||||
## Consistency & Alignment
|
||||
|
||||
- [X] CHK019 - Are the metadata fields required by FR-004 and FR-005 fully consistent with the metadata schema defined in data-model.md — specifically, do the mandatory fields in the spec match the `type`, `section_id`, and `section_title` fields in the data model? [Consistency, Spec §FR-004, data-model.md §Vector Store Documents] - Left to implementation
|
||||
|
||||
- [X] CHK020 - Is SC-003 ("processing time ≤ 3× baseline") consistent with FR-003 — if description generation requires a vision model call per image, is the 3× cap realistic for a 500-page book with dense figures, and is this assumption documented? [Consistency, Spec §SC-003, Assumption §3, Gap] - not documented
|
||||
|
||||
- [X] CHK021 - Does the spec's description of citation display (FR-009) align with the `sources` format change documented in contracts/api.md — are the fields the spec says must be "distinct" actually represented distinctly in the API response? [Consistency, Spec §FR-009, contracts/api.md §4] - A section with image-source should be displayed in the front. Text source and image-source are distinct
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- Items marked `[Gap]` indicate requirements that appear absent or deferred; resolve before generating tasks
|
||||
- Items marked `[Ambiguity]` require a clearer definition in the spec before implementation starts
|
||||
- Items marked `[Consistency]` should be cross-checked between spec.md, data-model.md, and contracts/api.md
|
||||
- Mark items `[x]` when resolved; add inline notes with the resolution for traceability
|
||||
@@ -0,0 +1,34 @@
|
||||
# Specification Quality Checklist: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Purpose**: Validate specification completeness and quality before proceeding to planning
|
||||
**Created**: 2026-04-03
|
||||
**Feature**: [spec.md](../spec.md)
|
||||
|
||||
## Content Quality
|
||||
|
||||
- [x] No implementation details (languages, frameworks, APIs)
|
||||
- [x] Focused on user value and business needs
|
||||
- [x] Written for non-technical stakeholders
|
||||
- [x] All mandatory sections completed
|
||||
|
||||
## Requirement Completeness
|
||||
|
||||
- [x] No [NEEDS CLARIFICATION] markers remain
|
||||
- [x] Requirements are testable and unambiguous
|
||||
- [x] Success criteria are measurable
|
||||
- [x] Success criteria are technology-agnostic (no implementation details)
|
||||
- [x] All acceptance scenarios are defined
|
||||
- [x] Edge cases are identified
|
||||
- [x] Scope is clearly bounded
|
||||
- [x] Dependencies and assumptions identified
|
||||
|
||||
## Feature Readiness
|
||||
|
||||
- [x] All functional requirements have clear acceptance criteria
|
||||
- [x] User scenarios cover primary flows
|
||||
- [x] Feature meets measurable outcomes defined in Success Criteria
|
||||
- [x] No implementation details leak into specification
|
||||
|
||||
## Notes
|
||||
|
||||
- All items pass. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
|
||||
@@ -0,0 +1,172 @@
|
||||
# API Contracts: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03
|
||||
**Base path**: `/api/v1`
|
||||
**Auth**: HTTP Basic (existing)
|
||||
|
||||
---
|
||||
|
||||
## New / Changed Endpoints
|
||||
|
||||
### 1. Re-embed a book (new)
|
||||
|
||||
Triggers a full re-embedding of an already-processed book, replacing all existing chunks and
|
||||
figures with the new image-aware pipeline output. Safe to call on books previously embedded
|
||||
by feature 001.
|
||||
|
||||
```
|
||||
POST /api/v1/books/{id}/reembed
|
||||
```
|
||||
|
||||
**Path parameters**
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|-----------|------|-------------|
|
||||
| `id` | UUID | Book ID |
|
||||
|
||||
**Response** `202 Accepted`
|
||||
|
||||
```json
|
||||
{ "bookId": "uuid", "status": "PROCESSING" }
|
||||
```
|
||||
|
||||
**Error responses**
|
||||
|
||||
| Status | Condition |
|
||||
|--------|-----------|
|
||||
| 404 | Book not found |
|
||||
| 409 | Book already in PROCESSING state |
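
The response and error codes above amount to a small decision rule. The sketch below expresses it in plain Java with no Spring dependency; the class `ReembedRules`, the method `statusFor`, and modelling the book's current state as an `Optional<String>` are assumptions made for illustration, not the actual controller code.

```java
import java.util.Optional;

// Hypothetical helper: illustrates the documented status codes only.
final class ReembedRules {

    /**
     * Maps a book's current status to the HTTP status of POST /books/{id}/reembed.
     * An empty Optional stands for "no book with this ID".
     */
    static int statusFor(Optional<String> currentStatus) {
        if (currentStatus.isEmpty()) {
            return 404; // Book not found
        }
        if ("PROCESSING".equals(currentStatus.get())) {
            return 409; // Book already in PROCESSING state
        }
        return 202; // Accepted: re-embedding starts
    }

    public static void main(String[] args) {
        // "COMPLETED" is a hypothetical name for any non-PROCESSING status.
        System.out.println(statusFor(Optional.of("COMPLETED")));
        System.out.println(statusFor(Optional.of("PROCESSING")));
        System.out.println(statusFor(Optional.empty()));
    }
}
```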
|
||||
|
||||
---
|
||||
|
||||
### 2. Get figures for a book (new)
|
||||
|
||||
Returns the list of extracted figures for a book, including their type, caption, and image URL.
|
||||
Used by the frontend to display a figure gallery or inline figures in chat responses.
|
||||
|
||||
```
|
||||
GET /api/v1/books/{id}/figures
|
||||
```
|
||||
|
||||
**Path parameters**
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|-----------|------|-------------|
|
||||
| `id` | UUID | Book ID |
|
||||
|
||||
**Response** `200 OK`
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"figureId": "youmans-7ed-fig-12-4",
|
||||
"label": "Fig. 12-4",
|
||||
"caption": "Coronal cross-section of the cavernous sinus showing cranial nerve relationships",
|
||||
"figureType": "ANATOMICAL_DIAGRAM",
|
||||
"page": 184,
|
||||
"imageUrl": "/api/v1/figures/550e8400-e29b-41d4-a716-446655440000/youmans-7ed-fig-12-4.png",
|
||||
"sectionId": "youmans-7ed-ch12-s2-3",
|
||||
"sectionTitle": "Cavernous Sinus"
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
**Error responses**
|
||||
|
||||
| Status | Condition |
|
||||
|--------|-----------|
|
||||
| 404 | Book not found |
|
||||
|
||||
---
|
||||
|
||||
### 3. Serve figure image (new)
|
||||
|
||||
Serves the extracted figure image file. Mounted as a static resource from the file store.
|
||||
|
||||
```
|
||||
GET /api/v1/figures/{bookId}/{filename}
|
||||
```
|
||||
|
||||
**Path parameters**
|
||||
|
||||
| Parameter | Type | Description |
|
||||
|-----------|------|-------------|
|
||||
| `bookId` | UUID | Book ID |
|
||||
| `filename` | string | Image filename (e.g. `youmans-7ed-fig-12-4.png`) |
|
||||
|
||||
**Response** `200 OK` — binary PNG
|
||||
**Content-Type**: `image/png`
|
||||
|
||||
**Error responses**
|
||||
|
||||
| Status | Condition |
|
||||
|--------|-----------|
|
||||
| 404 | Image file not found |
|
||||
|
||||
---
|
||||
|
||||
### 4. Chat message response — extended source format (changed)
|
||||
|
||||
The existing `POST /api/v1/chat/sessions/{id}/messages` endpoint is unchanged in its request
|
||||
format. The response `sources` field is extended to include figure references.
|
||||
|
||||
**Existing request** (unchanged):
|
||||
|
||||
```json
|
||||
{ "content": "Describe the anatomy of the cavernous sinus" }
|
||||
```
|
||||
|
||||
**Response** `200 OK` — extended `sources`:
|
||||
|
||||
```json
|
||||
{
|
||||
"id": "uuid",
|
||||
"role": "ASSISTANT",
|
||||
"content": "The cavernous sinus is ... [Fig. 12-4, p.184] ...",
|
||||
"sources": [
|
||||
{
|
||||
"type": "TEXT",
|
||||
"bookTitle": "Youmans and Winn Neurological Surgery, 7th Ed.",
|
||||
"page": 184,
|
||||
"chunkText": "The cavernous sinus contains ..."
|
||||
},
|
||||
{
|
||||
"type": "FIGURE",
|
||||
"bookTitle": "Youmans and Winn Neurological Surgery, 7th Ed.",
|
||||
"page": 184,
|
||||
"figureId": "youmans-7ed-fig-12-4",
|
||||
"label": "Fig. 12-4",
|
||||
"caption": "Coronal cross-section of the cavernous sinus ...",
|
||||
"figureType": "ANATOMICAL_DIAGRAM",
|
||||
"imageUrl": "/api/v1/figures/550e8400-e29b-41d4-a716-446655440000/youmans-7ed-fig-12-4.png"
|
||||
}
|
||||
],
|
||||
"createdAt": "2026-04-03T12:00:00Z"
|
||||
}
|
||||
```
|
||||
|
||||
**Changed fields in `sources` array**:
|
||||
|
||||
| Field | Old | New |
|
||||
|-------|-----|-----|
|
||||
| `type` | absent | `"TEXT"` or `"FIGURE"` |
|
||||
| `figureId` | absent | figure ID string (FIGURE type only) |
|
||||
| `label` | absent | caption label (FIGURE type only) |
|
||||
| `caption` | absent | full caption (FIGURE type only) |
|
||||
| `figureType` | absent | enum name (FIGURE type only) |
|
||||
| `imageUrl` | absent | image URL (FIGURE type only) |
|
||||
|
||||
---
|
||||
|
||||
## Unchanged Endpoints
|
||||
|
||||
All endpoints from feature 001 remain at their existing paths with no breaking changes:
|
||||
|
||||
- `POST /api/v1/books/upload`
|
||||
- `GET /api/v1/books`
|
||||
- `DELETE /api/v1/books/{id}`
|
||||
- `GET /api/v1/topics`
|
||||
- `GET /api/v1/topics/{id}/summary`
|
||||
- `POST /api/v1/chat/sessions`
|
||||
- `GET /api/v1/chat/sessions/{id}/messages`
|
||||
- `DELETE /api/v1/chat/sessions/{id}`
|
||||
@@ -0,0 +1,79 @@
|
||||
# Internal Contract: DocumentAiPageParser → FigureExtractionService
|
||||
|
||||
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-04
|
||||
**Type**: Internal Java DTO (not an HTTP contract)
|
||||
|
||||
---
|
||||
|
||||
## Purpose
|
||||
|
||||
`PageResult` is the internal data transfer object produced by `DocumentAiPageParser` for each
|
||||
PDF page. It decouples the Google Document AI SDK types from the rest of the pipeline so that
|
||||
`PdfStructureParser` can be replaced without cascading changes.
|
||||
|
||||
---
|
||||
|
||||
## Java Record
|
||||
|
||||
```java
|
||||
package com.aiteacher.document;
|
||||
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* Internal DTO produced by DocumentAiPageParser for one PDF page.
|
||||
* Decouples the Document AI SDK types from downstream services.
|
||||
*/
|
||||
public record PageResult(
|
||||
int pageNumber, // 1-based, matches Document.Page.getPageNumber()
|
||||
String orderedText, // full page text in correct reading order (blocks joined by \n\n)
|
||||
String headingTitle, // first HEADING block on page, or null
|
||||
List<FigureBbox> figures // detected figure regions (may be empty)
|
||||
) {
|
||||
|
||||
/**
|
||||
* Normalized bounding box for a detected figure region.
|
||||
* Coordinates are in the [0.0, 1.0] range relative to page dimensions.
|
||||
*/
|
||||
public record FigureBbox(
|
||||
float x, // left edge (normalized)
|
||||
float y, // top edge (normalized)
|
||||
float width, // width (normalized)
|
||||
float height, // height (normalized)
|
||||
String nearestCaption // text of adjacent paragraph block, or null
|
||||
) {}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Production Rules
|
||||
|
||||
| Field | Rule |
|
||||
|-------|------|
|
||||
| `orderedText` | Concatenation of all `PARAGRAPH` and `HEADING_*` blocks, joined with `\n\n`. Tables are represented as tab-separated text. |
|
||||
| `headingTitle` | First block whose `blockType` is `HEADING_1` through `HEADING_6`. `null` if no heading detected. |
|
||||
| `figures` | One entry per `VisualElement` with `type == "figure"` and `confidence ≥ 0.5`. Sorted top-to-bottom by `y`. |
|
||||
| `nearestCaption` | The `PARAGRAPH` block immediately following the figure bbox (by Y coordinate). May be `null` if no paragraph follows within 10% of page height. |
|
||||
|
||||
---
|
||||
|
||||
## Mapping from Document AI Proto
|
||||
|
||||
```
|
||||
Document.Page.Block → orderedText (concatenated)
|
||||
Document.Page.Block (HEADING_*) → headingTitle (first match)
|
||||
Document.Page.VisualElement → FigureBbox
|
||||
└─ layout.bounding_poly.normalized_vertices[0] → (x, y) top-left
|
||||
└─ normalized_vertices[2] → (x+w, y+h) bottom-right
|
||||
```
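
As a rough illustration of the vertex mapping above, the sketch below derives `x`, `y`, `width` and `height` from the top-left and bottom-right corners. `Vertex` is a stand-in record invented here so the example compiles without the Document AI SDK, and `BboxMapping.fromCorners` is a hypothetical helper name; only the shape of `FigureBbox` comes from the contract.

```java
// Hypothetical sketch: Vertex stands in for the SDK's normalized vertex type.
final class BboxMapping {

    /** Stand-in for a normalized vertex; coordinates are in the [0.0, 1.0] range. */
    record Vertex(float x, float y) {}

    /** Same shape as PageResult.FigureBbox in the contract above. */
    record FigureBbox(float x, float y, float width, float height, String nearestCaption) {}

    /**
     * Builds a bounding box from normalized_vertices[0] (top-left)
     * and normalized_vertices[2] (bottom-right).
     */
    static FigureBbox fromCorners(Vertex topLeft, Vertex bottomRight, String nearestCaption) {
        float width = bottomRight.x() - topLeft.x();
        float height = bottomRight.y() - topLeft.y();
        return new FigureBbox(topLeft.x(), topLeft.y(), width, height, nearestCaption);
    }

    public static void main(String[] args) {
        FigureBbox box = fromCorners(new Vertex(0.25f, 0.5f), new Vertex(0.75f, 0.875f), null);
        System.out.println(box);
    }
}
```

Because the coordinates are normalized, a consumer has to multiply them by the rendered page's pixel dimensions before cropping.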
|
||||
|
||||
---
|
||||
|
||||
## Consumers
|
||||
|
||||
| Consumer | What It Uses |
|
||||
|----------|-------------|
|
||||
| `BookEmbeddingService` | `orderedText` → `SectionEntity.fullText`; `headingTitle` → `SectionEntity.title` |
|
||||
| `FigureExtractionService` | `figures` list → renders page via PDFBox, crops each bbox to `BufferedImage` |
|
||||
| `TextChunkingService` | Receives `SectionEntity` (indirectly uses `orderedText`) — **unchanged** |
|
||||
@@ -0,0 +1,84 @@
|
||||
# Internal Contract: MarkerPageParser → FigureExtractionService / BookEmbeddingService
|
||||
|
||||
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-04
|
||||
**Type**: Internal Java DTO (not an HTTP contract)
|
||||
|
||||
---
|
||||
|
||||
## Purpose
|
||||
|
||||
`PageResult` is the internal data transfer object produced by `MarkerPageParser` for each
|
||||
PDF page. It decouples the Marker HTTP API from the rest of the pipeline. Downstream consumers
|
||||
(`BookEmbeddingService`, `FigureExtractionService`, `TextChunkingService`) are unaware of
|
||||
Marker and depend only on this DTO.
|
||||
|
||||
---
|
||||
|
||||
## Java Record
|
||||
|
||||
```java
|
||||
package com.aiteacher.document;
|
||||
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* Internal DTO produced by MarkerPageParser for one PDF page.
|
||||
* Decouples the Marker HTTP API from downstream services.
|
||||
*/
|
||||
public record PageResult(
|
||||
int pageNumber, // 1-based, derived from Marker page block index
|
||||
String orderedText, // full page text in correct reading order (blocks joined by \n\n)
|
||||
String headingTitle, // first SectionHeader block on page, or null
|
||||
List<FigureData> figures // extracted figure images (may be empty)
|
||||
) {
|
||||
|
||||
/**
|
||||
* A figure extracted from the page.
|
||||
* Image bytes are PNG data decoded from the Marker JSON `images` map.
|
||||
*/
|
||||
public record FigureData(
|
||||
byte[] imageBytes, // PNG image data (base64-decoded from Marker response)
|
||||
String nearestCaption, // text of the adjacent Caption block, or null
|
||||
String blockId // Marker block ID (e.g. "/page/0/Figure/2") for traceability
|
||||
) {}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Production Rules
|
||||
|
||||
| Field | Rule |
|
||||
|-------|------|
|
||||
| `pageNumber` | 1-based index derived from the Marker page block's position in the `children` array (index + 1). |
|
||||
| `orderedText` | HTML-stripped text from all `Text`, `TextInlineMath`, `SectionHeader`, `ListItem`, and `Table` blocks, joined with `\n\n`. Marker already returns them in reading order. |
|
||||
| `headingTitle` | Plain text of the first `SectionHeader` block on the page. `null` if no heading detected. |
|
||||
| `figures` | One `FigureData` per `Figure` or `Picture` block that has a non-empty `images` entry. Blocks with no image data are skipped. |
|
||||
| `imageBytes` | Base64-decoded bytes from `block.images[blockId]`. Marker returns PNG. |
|
||||
| `nearestCaption` | Plain text of the first `Caption` block that is a sibling appearing immediately after the figure block. `null` if absent. |
|
||||
|
||||
---
|
||||
|
||||
## Mapping from Marker JSON
|
||||
|
||||
```
|
||||
Marker JSON → PageResult
|
||||
|
||||
Page block ("/page/N/Page/M") → PageResult(pageNumber = N + 1)
|
||||
SectionHeader child → headingTitle (first match, HTML-stripped)
|
||||
Text / TextInlineMath children → orderedText (HTML-stripped, joined \n\n)
|
||||
Figure / Picture child → FigureData
|
||||
images[blockId] → FigureData.imageBytes (base64-decoded)
|
||||
next Caption sibling → FigureData.nearestCaption (HTML-stripped)
|
||||
blockId → FigureData.blockId
|
||||
```
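
Two of the mapping steps above, deriving the 1-based page number from a block ID and decoding the base64 image data, can be sketched with the JDK alone. The class `MarkerBlocks` and its method names are hypothetical, and parsing the page index out of the block ID (rather than using the position in the `children` array) is an assumption based on the `/page/N/...` ID format shown in this contract.

```java
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: illustrates the mapping rules, not the real MarkerPageParser.
final class MarkerBlocks {

    // Assumed ID shape, taken from the examples above: "/page/N/<BlockType>/M".
    private static final Pattern PAGE_INDEX = Pattern.compile("^/page/(\\d+)/");

    /** Returns the 1-based page number (N + 1) for a Marker block ID. */
    static int pageNumber(String blockId) {
        Matcher m = PAGE_INDEX.matcher(blockId);
        if (!m.find()) {
            throw new IllegalArgumentException("Unexpected block ID: " + blockId);
        }
        return Integer.parseInt(m.group(1)) + 1;
    }

    /** Decodes the base64 string stored under images[blockId] into raw image bytes. */
    static byte[] decodeImage(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    public static void main(String[] args) {
        System.out.println(pageNumber("/page/0/Figure/2"));
        System.out.println(decodeImage("iVBORw==").length);
    }
}
```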
|
||||
|
||||
---
|
||||
|
||||
## Consumers
|
||||
|
||||
| Consumer | What It Uses |
|
||||
|----------|-------------|
|
||||
| `BookEmbeddingService` | `orderedText` → `SectionEntity.fullText`; `headingTitle` → `SectionEntity.title` |
|
||||
| `FigureExtractionService` | `figures` list → decodes `imageBytes`, checks min size, saves to S3 |
|
||||
| `TextChunkingService` | Receives `SectionEntity` (uses `orderedText` indirectly) — **unchanged** |
|
||||
@@ -0,0 +1,305 @@
|
||||
# Data Model: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-03
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Three storage tiers work in concert:
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────────────┐
|
||||
│ PDF Upload │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ Parsing Pipeline │
|
||||
│ │ │ │
|
||||
│ ▼ ▼ │
|
||||
│ Postgres (source of truth) pgvector (search index) │
|
||||
│ - book - vector_store (text chunks) │
|
||||
│ - chapter - vector_store (figure captions) │
|
||||
│ - section (+ fullText) File Store (images) │
|
||||
│ - figure (metadata) - /uploads/figures/{bookId}/*.png │
|
||||
│ - chunk_figure_ref │
|
||||
└──────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Postgres Schema
|
||||
|
||||
### Existing tables (unchanged)
|
||||
|
||||
- `book` — status, metadata, page count (V1)
|
||||
- `chat_session`, `message` — conversation (V1)
|
||||
- `vector_store` — managed by Spring AI pgvector starter (V2)
|
||||
- `topic` — predefined topics (V3)
|
||||
|
||||
### New tables (Flyway V4)
|
||||
|
||||
```sql
|
||||
-- V4: Document hierarchy
|
||||
|
||||
CREATE TABLE chapter (
|
||||
id VARCHAR(200) PRIMARY KEY, -- "{bookId}-ch{N}"
|
||||
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||
number INT NOT NULL,
|
||||
title VARCHAR(500),
|
||||
page_start INT,
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||
);
|
||||
|
||||
CREATE TABLE section (
|
||||
id VARCHAR(200) PRIMARY KEY, -- "{bookId}-ch{N}-s{X}-{Y}"
|
||||
chapter_id VARCHAR(200) NOT NULL REFERENCES chapter(id) ON DELETE CASCADE,
|
||||
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||
number VARCHAR(50), -- "2.3" or "12.2.3"
|
||||
title VARCHAR(500),
|
||||
page_start INT NOT NULL,
|
||||
page_end INT NOT NULL,
|
||||
full_text TEXT NOT NULL, -- NOT in vector store
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||
);
|
||||
|
||||
CREATE INDEX idx_section_book ON section(book_id);
|
||||
CREATE INDEX idx_section_chapter ON section(chapter_id);
|
||||
```
|
||||
|
||||
### New tables (Flyway V5)
|
||||
|
||||
```sql
|
||||
-- V5: Figures and chunk→figure links
|
||||
|
||||
CREATE TABLE figure (
|
||||
id VARCHAR(200) PRIMARY KEY, -- "{bookId}-fig-{label}"
|
||||
book_id UUID NOT NULL REFERENCES book(id) ON DELETE CASCADE,
|
||||
section_id VARCHAR(200) REFERENCES section(id) ON DELETE SET NULL,
|
||||
chapter_id VARCHAR(200) REFERENCES chapter(id) ON DELETE SET NULL,
|
||||
label VARCHAR(100), -- "Fig. 12-4"
|
||||
caption TEXT,
|
||||
figure_type VARCHAR(50) NOT NULL, -- FigureType enum name
|
||||
page INT NOT NULL,
|
||||
image_path VARCHAR(1000) NOT NULL, -- relative path on disk
|
||||
caption_embedding_id UUID, -- ID in vector_store
|
||||
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
|
||||
);
|
||||
|
||||
CREATE TABLE chunk_figure_ref (
|
||||
chunk_id UUID NOT NULL, -- vector_store document ID
|
||||
figure_id VARCHAR(200) NOT NULL REFERENCES figure(id) ON DELETE CASCADE,
|
||||
mention_page INT,
|
||||
PRIMARY KEY (chunk_id, figure_id)
|
||||
);
|
||||
|
||||
CREATE INDEX idx_figure_book ON figure(book_id);
|
||||
CREATE INDEX idx_cfr_chunk ON chunk_figure_ref(chunk_id);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Java Domain Records
|
||||
|
||||
### Document hierarchy (new package `com.aiteacher.document`)
|
||||
|
||||
```java
import java.util.List;
import java.util.Map;
import java.util.UUID;

|
||||
// Root — in-memory only, not a JPA entity
|
||||
public record BookNode(
|
||||
String bookId,
|
||||
String title,
|
||||
String isbn,
|
||||
String edition,
|
||||
List<String> authors,
|
||||
List<ChapterNode> chapters
|
||||
) {}
|
||||
|
||||
// Chapter — maps to `chapter` table
|
||||
public record ChapterNode(
|
||||
String chapterId,
|
||||
String bookId,
|
||||
int number,
|
||||
String title,
|
||||
int pageStart,
|
||||
List<SectionNode> sections
|
||||
) {}
|
||||
|
||||
// Section — maps to `section` table; fullText stays in Postgres
|
||||
public record SectionNode(
|
||||
String sectionId,
|
||||
String chapterId,
|
||||
String bookId,
|
||||
String number,
|
||||
String title,
|
||||
int pageStart,
|
||||
int pageEnd,
|
||||
String fullText,
|
||||
List<TextChunkNode> chunks,
|
||||
List<FigureNode> figures
|
||||
) {}
|
||||
|
||||
// Text chunk — embedded into vector_store; references its parent section
|
||||
public record TextChunkNode(
|
||||
String chunkId, // UUID → becomes vector_store document ID
|
||||
String sectionId,
|
||||
String chapterId,
|
||||
String bookId,
|
||||
String text,
|
||||
int chunkIndex,
|
||||
int totalChunksInSection,
|
||||
int pageStart,
|
||||
int pageEnd,
|
||||
Map<String, Object> metadata // flattened for Spring AI filtering
|
||||
) {
|
||||
// The section title is passed in by the caller because the chunk only holds its parent's ID
public Map<String, Object> toMetadata(String sectionTitle) {
return Map.of(
"type", "TEXT",
"book_id", bookId,
"chapter_id", chapterId,
"section_id", sectionId,
"section_title", sectionTitle, // from parent SectionNode
|
||||
"page_start", pageStart,
|
||||
"page_end", pageEnd,
|
||||
"chunk_index", chunkIndex,
|
||||
"total_chunks", totalChunksInSection
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
// Figure — maps to `figure` table; caption embedded into vector_store
|
||||
public record FigureNode(
|
||||
String figureId,
|
||||
String sectionId,
|
||||
String chapterId,
|
||||
String bookId,
|
||||
String label, // "Fig. 12-4"
|
||||
String caption,
|
||||
FigureType type,
|
||||
int page,
|
||||
String imagePath, // relative: "figures/{bookId}/{figureId}.png"
|
||||
UUID captionEmbeddingId // ID in vector_store
|
||||
) {}
|
||||
```
|
||||
|
||||
### Figure type enum
|
||||
|
||||
```java
|
||||
public enum FigureType {
|
||||
ANATOMICAL_DIAGRAM,
|
||||
SURGICAL_PHOTOGRAPH,
|
||||
MRI_CT_SCAN,
|
||||
TABLE,
|
||||
CHART,
|
||||
INTRAOPERATIVE_IMAGE
|
||||
}
|
||||
```
|
||||
|
||||
Classification heuristic (applied to caption + surrounding text):
|
||||
|
||||
| Keyword(s) | FigureType |
|-----------|-----------|
| `MRI`, `CT`, `magnetic`, `resonance`, `tomography` | `MRI_CT_SCAN` |
| `intraoperative`, `intra-op` | `INTRAOPERATIVE_IMAGE` |
| `table`, `Table` (at line start) | `TABLE` |
| `chart`, `graph`, `histogram` | `CHART` |
| `photograph`, `photo` | `SURGICAL_PHOTOGRAPH` |
| (default) | `ANATOMICAL_DIAGRAM` |
|
||||
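
The keyword table above could be applied roughly as follows. This is a minimal sketch, not the project's classifier: `FigureTypeClassifier` is a hypothetical name, rows are checked top to bottom so earlier rows win, and matching `MRI`/`CT` case-sensitively as whole words is an assumption made to avoid false hits on words such as "structure".

```java
import java.util.regex.Pattern;

enum FigureType {
    ANATOMICAL_DIAGRAM, SURGICAL_PHOTOGRAPH, MRI_CT_SCAN, TABLE, CHART, INTRAOPERATIVE_IMAGE
}

// Hypothetical helper illustrating the keyword heuristic; not a class named in the plan.
final class FigureTypeClassifier {

    // Assumption: acronyms must appear as upper-case whole words.
    private static final Pattern SCAN =
            Pattern.compile("\\b(?:MRI|CT)\\b|(?i:magnetic|resonance|tomography)");
    private static final Pattern INTRAOP = Pattern.compile("(?i)intraoperative|intra-op");
    // "table" only counts at the start of a line, as the keyword table specifies.
    private static final Pattern TABLE = Pattern.compile("(?im)^\\s*table\\b");
    private static final Pattern CHART = Pattern.compile("(?i)\\b(?:chart|graph|histogram)");
    private static final Pattern PHOTO = Pattern.compile("(?i)\\bphoto");

    private FigureTypeClassifier() {}

    static FigureType classify(String captionAndContext) {
        if (captionAndContext == null || captionAndContext.isBlank()) {
            return FigureType.ANATOMICAL_DIAGRAM;
        }
        if (SCAN.matcher(captionAndContext).find()) return FigureType.MRI_CT_SCAN;
        if (INTRAOP.matcher(captionAndContext).find()) return FigureType.INTRAOPERATIVE_IMAGE;
        if (TABLE.matcher(captionAndContext).find()) return FigureType.TABLE;
        if (CHART.matcher(captionAndContext).find()) return FigureType.CHART;
        if (PHOTO.matcher(captionAndContext).find()) return FigureType.SURGICAL_PHOTOGRAPH;
        return FigureType.ANATOMICAL_DIAGRAM; // default row
    }
}
```

Because the rows are ordered, a caption mentioning both "CT" and "photograph" resolves to `MRI_CT_SCAN`; reordering the checks changes that precedence.
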
|
||||
### Chunk–figure join record
|
||||
|
||||
```java
|
||||
// Maps to `chunk_figure_ref` table
|
||||
public record ChunkFigureRef(
|
||||
UUID chunkId,
|
||||
String figureId,
|
||||
int mentionPage
|
||||
) {}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Vector Store Documents
|
||||
|
||||
All documents in `vector_store` carry a `metadata` JSON column with a `type` field for filtering.
|
||||
|
||||
### Text chunk document
|
||||
|
||||
| Field | Value |
|-------|-------|
| `content` | chunk text (400–600 tokens) |
| `metadata.type` | `"TEXT"` |
| `metadata.book_id` | book UUID |
| `metadata.book_title` | book title string |
| `metadata.chapter_id` | chapter ID string |
| `metadata.section_id` | section ID string |
| `metadata.section_title` | section title string |
| `metadata.page_start` | int |
| `metadata.page_end` | int |
| `metadata.chunk_index` | int (0-based) |
| `metadata.total_chunks` | int |
|
||||
|
||||
### Figure caption document
|
||||
|
||||
| Field | Value |
|-------|-------|
| `content` | vision-generated description + caption text |
| `metadata.type` | `"FIGURE"` |
| `metadata.book_id` | book UUID |
| `metadata.book_title` | book title string |
| `metadata.chapter_id` | chapter ID string |
| `metadata.section_id` | section ID string |
| `metadata.figure_id` | figure ID string |
| `metadata.figure_type` | enum name string |
| `metadata.image_path` | relative file path |
| `metadata.label` | caption label e.g. `"Fig. 12-4"` |
| `metadata.page` | int |
|
||||
|
||||
---
|
||||
|
||||
## File Store Layout
|
||||
|
||||
```
|
||||
uploads/
└── figures/
    └── {bookId}/
        ├── {figureId}.png
        └── ...
|
||||
```
|
||||
|
||||
- Base path configurable via `app.figure-storage.base-path` (default: `./uploads`)
|
||||
- Files are served via `GET /api/v1/figures/{bookId}/{filename}` (static resource mapping)
|
||||
- Gitignored; not version-controlled
|
||||
|
||||
---
|
||||
|
||||
## State Transitions
|
||||
|
||||
Book processing extends the existing `BookStatus` state machine:
|
||||
|
||||
```
|
||||
PENDING → PROCESSING → READY
                     ↘ FAILED
|
||||
```
|
||||
|
||||
During `PROCESSING`:
|
||||
1. Parse PDF structure → extract chapters/sections → persist to Postgres
|
||||
2. Split sections into text chunks → embed → write to vector_store
|
||||
3. Extract images per page → filter by min size → save PNG → generate vision description → embed caption → write figure to Postgres + vector_store
|
||||
4. Write chunk_figure_refs for all detected figure references in text
|
||||
|
||||
Failure at step 3 (individual page) → log + skip that page's images; continue.
|
||||
Failure at any other step → set `BookStatus.FAILED`.
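
The diagram above can be encoded as a small transition table. The sketch below is illustrative only: the real `BookStatus` enum already exists in the codebase and may look different, and `allowedNext` / `canTransitionTo` are hypothetical method names. It models exactly the edges drawn in the diagram, so a re-embed flow that moves a finished book back to `PROCESSING` would need an extra edge.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical rendering of the state machine shown above.
enum BookStatus {
    PENDING, PROCESSING, READY, FAILED;

    /** States reachable in one step, as drawn in the diagram. */
    Set<BookStatus> allowedNext() {
        return switch (this) {
            case PENDING -> EnumSet.of(PROCESSING);
            case PROCESSING -> EnumSet.of(READY, FAILED);
            case READY, FAILED -> EnumSet.noneOf(BookStatus.class); // terminal in this sketch
        };
    }

    boolean canTransitionTo(BookStatus next) {
        return allowedNext().contains(next);
    }
}
```

A service could call `canTransitionTo` before persisting a status change and reject anything the diagram does not allow.
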
|
||||
|
||||
---
|
||||
|
||||
## Retrieval Result Structure
|
||||
|
||||
```java
|
||||
public record RetrievalResult(
|
||||
List<SectionNode> parentSections, // expanded full-text context
|
||||
List<Document> figureVectorHits, // semantic figure matches
|
||||
List<FigureNode> linkedFigures // figures explicitly referenced in text chunks
|
||||
) {}
|
||||
```
|
||||
|
||||
The `NeurosurgeryRetriever` service deduplicates figures across `figureVectorHits` and
`linkedFigures` before passing the result to the LLM prompt builder.
|
||||
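
One way to deduplicate figures coming from the two sources is to merge their figure IDs through an insertion-ordered set. This is a sketch under assumptions: `FigureMerger` and `mergeFigureIds` are hypothetical names, both inputs are reduced to plain figure-ID strings, and putting explicitly linked figures ahead of semantic hits is an arbitrary ordering choice rather than something the data model prescribes.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; the real NeurosurgeryRetriever may merge differently.
final class FigureMerger {

    private FigureMerger() {}

    /** Returns the union of both ID lists, linked figures first, each ID once. */
    static List<String> mergeFigureIds(List<String> linkedFigureIds,
                                       List<String> vectorHitFigureIds) {
        Set<String> merged = new LinkedHashSet<>(linkedFigureIds);
        merged.addAll(vectorHitFigureIds); // IDs already present are ignored
        return List.copyOf(merged);        // keeps the set's iteration order
    }
}
```
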
@@ -0,0 +1,85 @@
|
||||
# Implementation Plan: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-04 | **Spec**: [spec.md](spec.md)
|
||||
**Input**: Feature specification from `/specs/002-image-aware-embedding/spec.md`
|
||||
|
||||
## Summary
|
||||
|
||||
Enhance the PDF embedding pipeline to extract figures and generate AI descriptions for them,
|
||||
making image content semantically searchable alongside text. PDF parsing and figure extraction
|
||||
are delegated to a local **Marker** server (`http://localhost:8000/marker/upload`), which
|
||||
returns reading-order text and pre-cropped figure images (base64) in a single JSON response,
|
||||
eliminating the need for PDFBox column heuristics and figure bbox rendering.
|
||||
|
||||
## Technical Context
|
||||
|
||||
**Language/Version**: Java 25 (backend), TypeScript / Node 20 (frontend)
|
||||
**Primary Dependencies**: Spring Boot 4.0.5, Spring AI 2.0.0-M4, OpenAI API (embeddings +
|
||||
GPT-4o vision), PDFBox 3.0.3 (via `spring-ai-pdf-document-reader` — retained transitively,
|
||||
no longer used directly), Marker local HTTP API (`http://localhost:8000/marker/upload`)
|
||||
**Storage**: PostgreSQL (JPA + Flyway), pgvector (Spring AI `VectorStore`), S3-compatible
|
||||
object store (figure images via `FigureStorageService`)
|
||||
**Testing**: Maven / JUnit 5 (`spring-boot-starter-test`)
|
||||
**Target Platform**: Linux server
|
||||
**Project Type**: Web application (backend API + frontend client)
|
||||
**Performance Goals**: SC-003 — book processing time ≤ 3× text-only for ≤ 500 pages
|
||||
**Constraints**: REST API only (Constitution III); Marker server must be running locally;
|
||||
S3-compatible storage configured via env vars
|
||||
**Scale/Scope**: POC — handful of books, <10 users
|
||||
|
||||
## Constitution Check
|
||||
|
||||
*GATE: Must pass before Phase 0 research. Re-checked after Phase 1 design.*
|
||||
|
||||
| Principle | Status | Notes |
|-----------|--------|-------|
| **I. KISS** | ✅ Justified | Marker replaces a bespoke PDFBox column heuristic + Google Cloud SDK with one HTTP call. Net complexity reduction vs. the Document AI approach. |
| **II. Easy to Change** | ✅ | `MarkerPageParser` is the only class that knows about Marker; swap the implementation to replace Marker with any other parser. `PageResult` DTO remains unchanged. |
| **III. Web-First** | ✅ | Internal pipeline change; no public API contract change. |
| **IV. Documentation** | ✅ | README must be updated to show Marker as a local external service. |
|
||||
|
||||
## Project Structure
|
||||
|
||||
### Documentation (this feature)
|
||||
|
||||
```text
|
||||
specs/002-image-aware-embedding/
├── plan.md                      # This file
├── research.md                  # Phase 0 output
├── data-model.md                # Phase 1 output
├── quickstart.md                # Phase 1 output
├── contracts/
│   ├── api.md                   # HTTP API contracts (unchanged from initial plan)
│   └── marker-page-result.md    # Internal DTO contract (MarkerPageParser → downstream)
└── tasks.md                     # Phase 2 output (/speckit.tasks — not created here)
|
||||
```
|
||||
|
||||
### Source Code
|
||||
|
||||
```text
|
||||
backend/
├── src/main/java/com/aiteacher/
│   ├── config/
│   │   └── MarkerConfig.java                  # NEW: RestClient bean + base-url property
│   ├── document/
│   │   ├── MarkerPageParser.java              # NEW: replaces DocumentAiPageParser + PdfStructureParser
│   │   ├── PageResult.java                    # UPDATED: FigureBbox → FigureData (bytes not bbox)
│   │   ├── FigureExtractionService.java       # UPDATED: no PDFBox render; decode bytes directly
│   │   ├── TextChunkingService.java           # UNCHANGED
│   │   ├── VisionDescriptionService.java      # UNCHANGED
│   │   └── [removed] DocumentAiPageParser.java
│   ├── book/
│   │   └── BookEmbeddingService.java          # MINOR UPDATE: inject MarkerPageParser, drop DocumentAiPageParser
│   └── [removed] config/DocumentAiConfig.java
├── src/main/resources/
│   └── application.yaml                       # UPDATED: remove document-ai.*, add marker.base-url
└── pom.xml                                    # UPDATED: remove google-cloud-document-ai
|
||||
```
|
||||
|
||||
**Structure Decision**: Option 2 (backend + frontend) per constitution Technology Constraints.
|
||||
Frontend changes are display-only (render figure citations inline).
|
||||
|
||||
## Complexity Tracking
|
||||
|
||||
> No constitution violations — Marker reduces complexity compared to the previous
|
||||
> Google Document AI approach (fewer dependencies, no GCP credentials, no 15-page batching).
|
||||
@@ -0,0 +1,121 @@
|
||||
# Quickstart: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-04 (updated: Marker replaces Google Document AI)
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Docker Compose running (PostgreSQL + pgvector)
|
||||
- OpenAI API key set as env var `OPENAI_API_KEY`
|
||||
- Java 25 + Maven on PATH
|
||||
- **Marker server running** on `http://localhost:8000` (see setup below)
|
||||
- S3-compatible bucket configured (existing setup)
|
||||
|
||||
---
|
||||
|
||||
## Marker Server Setup (one-time)
|
||||
|
||||
Marker is a local Python service — no cloud credentials required.
|
||||
|
||||
```bash
|
||||
# Install (Python 3.10+ required)
|
||||
pip install marker-pdf
|
||||
|
||||
# Start the server on port 8000
|
||||
marker_server --port 8000
|
||||
```
|
||||
|
||||
The server is ready when you see:
|
||||
```
|
||||
INFO: Uvicorn running on http://0.0.0.0:8000
|
||||
```
|
||||
|
||||
Keep the server running in the background (or use a process manager like `systemd` or `screen`).
|
||||
|
||||
---
|
||||
|
||||
## Backend Configuration
|
||||
|
||||
Add or update `backend/src/main/resources/application.yaml`:
|
||||
|
||||
```yaml
|
||||
app:
  figure-storage:
    endpoint: https://your-s3-endpoint
    region: your-region
    bucket: ${S3_BUCKET:aiteacher}
    access-key-id: ${S3_ACCESS_KEY_ID}
    secret-access-key: ${S3_SECRET_ACCESS_KEY}
    min-image-size-px: 100 # skip decorative images smaller than 100×100 px
  marker:
    base-url: ${MARKER_BASE_URL:http://localhost:8000}
  embedding:
    batch-size: 20
    batch-delay-ms: 2000
|
||||
```
|
||||
|
||||
No GCP credentials or project IDs are needed.
|
||||
|
||||
---
|
||||
|
||||
## Database Migration
|
||||
|
||||
Two Flyway migrations run automatically on startup:
|
||||
|
||||
- `V4__document_hierarchy.sql` — adds `chapter` and `section` tables
|
||||
- `V5__figures_and_refs.sql` — adds `figure` and `chunk_figure_ref` tables
|
||||
|
||||
No manual DB setup needed.
|
||||
|
||||
---
|
||||
|
||||
## Re-embedding Existing Books
|
||||
|
||||
Books embedded by feature 001 (text-only) remain functional for text queries. To add image
|
||||
support, trigger a re-embed:
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:8080/api/v1/books/{bookId}/reembed \
|
||||
-u admin:password
|
||||
```
|
||||
|
||||
The book transitions to `PROCESSING`, old chunks and figures are deleted, and the new
|
||||
image-aware pipeline runs. Status can be polled via `GET /api/v1/books`.
|
||||
|
||||
---
|
||||
|
||||
## Verifying Image Extraction
|
||||
|
||||
1. Ensure Marker is running: `curl http://localhost:8000` should respond.
|
||||
2. Upload a PDF with diagrams: `POST /api/v1/books/upload`
|
||||
3. Wait for `status: "READY"` via `GET /api/v1/books`
|
||||
4. List figures: `GET /api/v1/books/{id}/figures` — should return at least one entry per image page
|
||||
5. Ask a diagram-specific question in chat — response `sources` should include a `type: "FIGURE"` entry
|
||||
|
||||
---
|
||||
|
||||
## Frontend: Rendering Inline Figures
|
||||
|
||||
The assistant message `content` field will contain figure references in the format
|
||||
`[Fig. 12-4, p.184]`. The frontend should:
|
||||
|
||||
1. Parse `[Fig. X, p.N]` patterns in assistant message text
|
||||
2. Look up the matching entry in `sources` where `type === "FIGURE"`
|
||||
3. Render the figure inline using the `imageUrl` field
|
||||
|
||||
---
|
||||
|
||||
## Running Tests
|
||||
|
||||
```bash
|
||||
cd backend
|
||||
mvn test
|
||||
```
|
||||
|
||||
Key new test classes:
|
||||
- `MarkerPageParserTest` — unit tests for JSON parsing and block-to-PageResult mapping
|
||||
- `FigureExtractionServiceTest` — unit tests for base64 decode, size filtering, classification
|
||||
- `NeurosurgeryRetrieverTest` — unit tests for dual-search merge and deduplication
|
||||
- `BookEmbeddingServiceIntegrationTest` — integration test: upload PDF with known figures,
|
||||
verify figures appear in `GET /api/v1/books/{id}/figures`
|
||||
@@ -0,0 +1,411 @@
|
||||
# Research: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Branch**: `002-image-aware-embedding` | **Date**: 2026-04-04 (updated: Marker replaces Google Document AI)
|
||||
|
||||
This document resolves all technical unknowns identified during planning. Decisions 1–10 cover
|
||||
the core pipeline. The **Marker Study** section at the bottom explains why Marker was chosen
|
||||
over Google Document AI to drive PDF parsing and figure extraction.
|
||||
|
||||
---
|
||||
|
||||
## Decision 1: Document Hierarchy Model
|
||||
|
||||
**Decision**: Adopt a four-level hierarchy — `BookNode` → `ChapterNode` → `SectionNode` →
|
||||
`TextChunkNode` + `FigureNode`. The `SectionNode` is the pivotal unit: it holds the full section
|
||||
text in Postgres and is used for parent-child context expansion at retrieval time.
|
||||
|
||||
**Rationale**: A flat page-per-document model (current implementation) loses structural context.
|
||||
When a user asks a multi-faceted clinical question, the LLM needs the surrounding section text,
|
||||
not just the matching fragment. Parent-child retrieval — where chunks point to their parent
|
||||
section — is the established pattern for RAG precision. The hierarchy also makes figure-to-section
|
||||
association explicit and queryable.
|
||||
|
||||
**Alternatives considered**:
|
||||
- Keep flat page model, add metadata only → rejected: insufficient for precise citation and
|
||||
context expansion
|
||||
- Chapter-level retrieval (coarser than section) → rejected: too much irrelevant context sent
|
||||
to LLM; cost and latency increase
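
The parent-child expansion described in this decision can be sketched as follows: take the ranked chunk hits, collect their parent section IDs in rank order without repeats, and load each section once. Everything here is illustrative: `ChunkHit`, `SectionText`, and `ParentExpander` are hypothetical stand-ins for the real vector-store documents and the `SectionNode` record, and the in-memory map stands in for a Postgres lookup.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified shapes for a chunk search hit and a stored section.
record ChunkHit(String chunkId, String sectionId, double score) {}
record SectionText(String sectionId, String title, String fullText) {}

final class ParentExpander {

    private ParentExpander() {}

    /** Maps ranked chunk hits to their distinct parent sections, best-ranked first. */
    static List<SectionText> expand(List<ChunkHit> rankedHits,
                                    Map<String, SectionText> sectionsById) {
        Set<String> sectionIds = new LinkedHashSet<>();
        for (ChunkHit hit : rankedHits) {
            sectionIds.add(hit.sectionId()); // several chunks may share one section
        }
        List<SectionText> parents = new ArrayList<>();
        for (String id : sectionIds) {
            SectionText section = sectionsById.get(id);
            if (section != null) {           // skip hits whose section is missing
                parents.add(section);
            }
        }
        return parents;
    }
}
```

Sending each parent section once, rather than once per matching chunk, keeps the prompt from repeating the same text when several chunks of one section score well.
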
|
||||
|
||||
---
|
||||
|
||||
## Decision 2: Document Parsing Strategy
|
||||
|
||||
**Decision**: Use **Marker** (local HTTP server, `http://localhost:8000/marker/upload`) as the
|
||||
single entry point for PDF parsing. A single `POST` with `output_format=json` returns:
|
||||
- Reading-order text blocks (headings, paragraphs) — no column-split heuristic needed
|
||||
- Pre-cropped figure images as base64-encoded PNG in the `images` map of each `Figure` block
|
||||
- Table, equation, and code blocks as structured HTML
|
||||
|
||||
`MarkerPageParser` translates the Marker JSON response into `List<PageResult>`, which is the
|
||||
same internal DTO used by the rest of the pipeline.
|
||||
|
||||
**Rationale**: Marker handles column reordering, scanned-page OCR, and figure cropping in one
|
||||
call, eliminating the PDFBox column heuristic (`PdfStructureParser`) and the PDFBox
|
||||
render+crop loop in `FigureExtractionService`. Net result: fewer classes, no cloud dependency,
|
||||
no GCP credentials.
|
||||
|
||||
**Alternatives considered**:
|
||||
- PDFBox column heuristic (previous approach) → rejected: 50/50 split fails on asymmetric
|
||||
columns and scanned pages
|
||||
- Google Document AI Layout Parser → rejected: adds GCP credentials, per-page billing, 15-page
|
||||
batch limit, and still requires PDFBox to render+crop figure regions from bounding boxes.
|
||||
See Marker Study below for detailed comparison.
|
||||
- Screenshot each page + OCR → far slower; loses digital text quality
|
||||
|
||||
---
|
||||
|
||||
## Decision 3: Figure Content Representation
|
||||
|
||||
**Decision**: Generate a textual description of each extracted image using the OpenAI vision
|
||||
model (GPT-4o). This description becomes the `content` field of the figure's vector store
|
||||
document. The figure caption (parsed from the surrounding text) is also included to maximise
|
||||
retrieval signal.
|
||||
|
||||
**Rationale**: Caption-only embedding would miss figures with no caption or with sparse labels.
|
||||
Vision-generated descriptions produce richer semantic content (anatomy terms, structural
|
||||
relationships) that matches clinical queries. The OpenAI client already in use supports image
|
||||
inputs; no additional dependency is required.
|
||||
|
||||
**Alternatives considered**:
|
||||
- Caption-only embedding → insufficient when captions are absent or terse (common in textbooks)
|
||||
- Local vision model (LLaVA) → requires self-hosting; out of scope for POC
|
||||
- OCR only → extracts text visible in image but misses non-text visual content (diagrams, MRI)
|
||||
|
||||
---
|
||||
|
||||
## Decision 4: Dual Vector Search
|
||||
|
||||
**Decision**: At query time, run two parallel similarity searches:
|
||||
1. Text chunk search (filtered by `type = "TEXT"` and `book_id`)
|
||||
2. Figure caption search (filtered by `type = "FIGURE"` and `book_id`)
|
||||
|
||||
Results are merged and deduplicated. The LLM prompt receives the expanded parent section text
|
||||
plus a structured figure reference list.
|
||||
|
||||
**Rationale**: A single search would rank text and figures against each other; figures with
|
||||
terse captions would systematically lose to text chunks. Separate searches with independent
|
||||
`topK` allow tuning each modality independently.
|
||||
|
||||
**Alternatives considered**:
|
||||
- Single search, filter by relevance score → figure captions score lower than text; figures
|
||||
are systematically under-retrieved
|
||||
- Post-process text results to look up linked figures only → misses figures that are relevant
|
||||
to the query but not explicitly referenced in the retrieved text chunks
|
||||
|
||||
---
|
||||
|
||||
## Decision 5: Chunk-to-Figure Linking
|
||||
|
||||
**Decision**: During text parsing, whenever a pattern matching `Fig\.\s+\d+[\-\.]\d+` or
`Figure\s+\d+[\-\.]\d+` is found in a chunk, insert a row into the `chunk_figure_ref` table
linking `chunkId` → `figureId`. At retrieval time, after text chunks are retrieved, their
associated figures are fetched from this table and added to the LLM prompt.
|
||||
|
||||
**Rationale**: Explicit linking ensures that when a text chunk is retrieved, its referenced
|
||||
figures are always surfaced — even if the figure's caption did not score highly in the vector
|
||||
search. This is the higher-recall path; dual search (Decision 4) is the higher-precision path.
|
||||
|
||||
**Alternatives considered**:
|
||||
- Rely entirely on dual vector search → may miss figures referenced in retrieved text but
|
||||
scoring below the topK threshold in the figure search
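
The reference detection in this decision could look roughly like the sketch below. It assumes the two patterns are combined into one expression with the literal dot after `Fig` escaped, and that only the figure number (for example `12-4`) is needed to look up the matching figure; `FigureReferenceExtractor` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that finds figure mentions inside one text chunk.
final class FigureReferenceExtractor {

    // "Fig." or "Figure", whitespace, then a number such as 12-4 or 12.4.
    private static final Pattern FIGURE_REF =
            Pattern.compile("\\b(?:Fig\\.|Figure)\\s+(\\d+[-.]\\d+)");

    private FigureReferenceExtractor() {}

    /** Returns the distinct figure numbers mentioned in the chunk, in order of first mention. */
    static List<String> extractFigureNumbers(String chunkText) {
        Set<String> numbers = new LinkedHashSet<>();
        Matcher matcher = FIGURE_REF.matcher(chunkText);
        while (matcher.find()) {
            numbers.add(matcher.group(1));
        }
        return new ArrayList<>(numbers);
    }
}
```

Each returned number would then be matched against the `label` of the book's figures to produce one `chunk_figure_ref` row per chunk and figure pair.
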
|
||||
|
||||
---
|
||||
|
||||
## Decision 6: Image Storage
|
||||
|
||||
**Decision**: Marker returns figure images as base64-encoded PNG bytes in the JSON response.
|
||||
`FigureExtractionService` decodes these bytes and passes them to `FigureStorageService`, which
|
||||
persists them to an S3-compatible bucket (`${app.figure-storage.bucket}`). The image path/URL
|
||||
is stored in `figure.image_path` in Postgres.
|
||||
|
||||
The `FigureStorageService` interface is unchanged; only the caller changes (from PDFBox crop
|
||||
to base64 decode).
|
||||
|
||||
**Rationale**: Marker's pre-cropped images remove the need for PDFBox rendering.
|
||||
`FigureStorageService` interface boundary satisfies Constitution Principle II (Easy to Change).
|
||||
|
||||
**Alternatives considered**:
|
||||
- Store base64 in Postgres JSONB → bloats DB; complicates backup; query performance degrades
|
||||
|
||||
---
|
||||
|
||||
## Decision 7: Figure Type Classification
|
||||
|
||||
**Decision**: Use the enum `FigureType { ANATOMICAL_DIAGRAM, SURGICAL_PHOTOGRAPH, MRI_CT_SCAN,
|
||||
TABLE, CHART, INTRAOPERATIVE_IMAGE }`. Classification is derived from:
|
||||
1. Caption keywords ("MRI", "CT", "Fig.", "Table") — heuristic, no model needed
|
||||
2. Marker `block_type` hint (`"Table"` → TABLE, `"Figure"` / `"Picture"` → ANATOMICAL_DIAGRAM default)
|
||||
3. Fall back to `ANATOMICAL_DIAGRAM` if unclassifiable
|
||||
|
||||
**Rationale**: Allows the frontend to render different icon/label per type (e.g., "MRI" badge).
|
||||
Heuristic classification avoids a separate model call per image at extraction time.
|
||||
|
||||
**Alternatives considered**:
|
||||
- Vision model classification → accurate but adds latency and cost per figure; deferrable
|
||||
- Single `FIGURE` type → loses citation granularity required by spec FR-004
|
||||
|
||||
---
|
||||
|
||||
## Decision 8: Metadata Schema for Vector Store Documents
|
||||
|
||||
**Decision**: All vector store documents carry a flat `Map<String, Object>` metadata for Spring
|
||||
AI filtering. Schema:
|
||||
|
||||
| Field | Text Chunk | Figure Chunk |
|-------|-----------|-------------|
| `type` | `"TEXT"` | `"FIGURE"` |
| `book_id` | ✓ | ✓ |
| `book_title` | ✓ | ✓ |
| `chapter_id` | ✓ | ✓ |
| `section_id` | ✓ | ✓ |
| `section_title` | ✓ | ✓ |
| `page_start` | ✓ | — |
| `page_end` | ✓ | — |
| `chunk_index` | ✓ | — |
| `total_chunks` | ✓ | — |
| `figure_id` | — | ✓ |
| `figure_type` | — | ✓ |
| `image_path` | — | ✓ |
| `label` | — | ✓ |
| `page` | — | ✓ |
|
||||
|
||||
**Rationale**: Flat map is required by Spring AI `FilterExpressionBuilder`. Separation by `type`
|
||||
allows independent filtering in dual search.
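
As an illustration of the figure column in the schema above, the sketch below builds the flat metadata map for one figure. `FigureMetadata` is a hypothetical helper, and the parameters are plain strings for brevity. One practical detail: the figure schema has eleven keys, and `Map.of` only has overloads for up to ten key-value pairs, so the sketch fills a `LinkedHashMap` instead.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for the flat metadata attached to a figure document.
final class FigureMetadata {

    private FigureMetadata() {}

    static Map<String, Object> of(String bookId, String bookTitle, String chapterId,
                                  String sectionId, String sectionTitle, String figureId,
                                  String figureType, String imagePath, String label, int page) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("type", "FIGURE");
        metadata.put("book_id", bookId);
        metadata.put("book_title", bookTitle);
        metadata.put("chapter_id", chapterId);
        metadata.put("section_id", sectionId);
        metadata.put("section_title", sectionTitle);
        metadata.put("figure_id", figureId);
        metadata.put("figure_type", figureType); // enum name, e.g. "MRI_CT_SCAN"
        metadata.put("image_path", imagePath);
        metadata.put("label", label);
        metadata.put("page", page);
        return Collections.unmodifiableMap(metadata);
    }
}
```
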
|
||||
|
||||
---
|
||||
|
||||
## Decision 9: Re-embedding Existing Books
|
||||
|
||||
**Decision**: Books already processed under feature 001 (text-only) are NOT automatically
|
||||
re-embedded. An explicit re-embed action is exposed via `POST /api/v1/books/{id}/reembed`
|
||||
(admin-triggered). The existing chunks remain valid for text queries until re-embedding completes.
|
||||
|
||||
**Rationale**: Automatic re-embedding on deploy would block the system and risk data loss if
|
||||
the process fails mid-way. An explicit, idempotent trigger is safer and more observable.
|
||||
|
||||
---
|
||||
|
||||
## Decision 10: Minimum Image Size Threshold
|
||||
|
||||
**Decision**: Images smaller than 100×100 pixels are discarded and no chunk is created. Marker
|
||||
returns PNG bytes; `FigureExtractionService` decodes to `BufferedImage` solely to check
|
||||
dimensions. This threshold filters out decorative elements without a classification model.
|
||||
|
||||
**Rationale**: Neurosurgery textbook diagrams and MRI scans are, in practice, well above 100×100 px, so the filter should not drop genuine figures.
|
||||
The threshold is configurable via `app.figure-storage.min-image-size-px`.
|
||||
|
||||
**Alternatives considered**:
|
||||
- No threshold → decorative icons pollute the figure index
|
||||
- ML-based classification → accurate but adds model dependency; not needed at POC scale
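
The size check in this decision might be implemented along these lines. This is a sketch, not the actual `FigureExtractionService`: `FigureSizeFilter` is a hypothetical name, and it assumes an image is kept only when both its width and its height reach the threshold, which is one reading of "smaller than 100×100 pixels".

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper: decodes a base64 image and applies the minimum-size rule.
final class FigureSizeFilter {

    private FigureSizeFilter() {}

    static boolean isLargeEnough(String base64Image, int minSizePx) {
        byte[] bytes = Base64.getDecoder().decode(base64Image);
        try {
            // The image is decoded only to read its dimensions.
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(bytes));
            if (image == null) {
                return false; // not a readable image; treat as discardable
            }
            // Assumption: both dimensions must reach the threshold.
            return image.getWidth() >= minSizePx && image.getHeight() >= minSizePx;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The `minSizePx` argument would come from `app.figure-storage.min-image-size-px`.
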
|
||||
|
||||
---
|
||||
|
||||
# Marker Study — Why Marker Replaces Google Document AI
|
||||
|
||||
*Added 2026-04-04.*
|
||||
|
||||
## What Marker Offers
|
||||
|
||||
Marker is an open-source, locally-runnable PDF-to-structured-content converter that uses a
|
||||
pipeline of deep-learning models (surya for OCR + layout detection, texify for equations).
|
||||
Key capabilities relevant to this project:
|
||||
|
||||
| Capability | Marker | Google Document AI |
|-----------|--------|--------------------|
| Multi-column reading order | ✅ | ✅ |
| OCR on scanned pages | ✅ | ✅ |
| Figure detection | ✅ returns pre-cropped images | ⚠️ returns bbox only; PDFBox still needed |
| Table extraction | ✅ HTML tables | ✅ |
| JSON output with image bytes | ✅ base64 in `images` map | ❌ |
| No cloud credentials | ✅ | ❌ GCP service account required |
| No per-page billing | ✅ | ❌ ~$10/1,000 pages |
| Batch size limits | None (local) | 15 pages / 20 MB per sync call |
| Setup | `pip install marker-pdf && marker_server` | GCP project + processor + IAM |
|
||||
|
||||
---
|
||||
|
||||
## Does Marker Solve the Current Pain Points?
|
||||
|
||||
### Pain Point 1: Naive 50/50 Column Split
|
||||
|
||||
**Answer: Yes, Marker fixes this completely.**
|
||||
|
||||
`PdfStructureParser.extractPageText()` splits pages at the horizontal midpoint with a 20%
|
||||
threshold. This fails on asymmetric columns and scanned pages. Marker's surya layout model
|
||||
returns blocks in natural reading order — no heuristic needed.
|
||||
|
||||
### Pain Point 2: Figure Detection Misses Rasterized Figures
|
||||
|
||||
**Answer: Yes, Marker fixes this for most cases.**
|
||||
|
||||
`FigureExtractionService` previously iterated PDF XObjects (only finds embedded XObject images,
|
||||
misses rasterized figures and vector-path drawings). Marker's layout model detects visual
|
||||
elements by type and returns the cropped image bytes directly — no PDFBox page rendering needed.
|
||||
|
||||
### Pain Point 3: OCR on Scanned Pages
|
||||
|
||||
**Answer: Yes, Marker handles scanned pages transparently via surya OCR.**
|
||||
|
||||
### Pain Point 4: Caption Detection
|
||||
|
||||
**Answer: Improved — Marker groups caption blocks with their figure block.**
|
||||
|
||||
The `block_type = "Caption"` block appears as a sibling or child adjacent to the `"Figure"`
|
||||
block in the Marker JSON, making caption association structural rather than regex-based.
|
||||
|
||||
---
|
||||
|
||||
## Marker API Integration
|
||||
|
||||
### Local Server Setup
|
||||
|
||||
```bash
|
||||
pip install marker-pdf
|
||||
marker_server --port 8000
|
||||
```
|
||||
|
||||
The server exposes `POST /marker/upload` (the user's configured endpoint).
|
||||
|
||||
### Request
|
||||
|
||||
```
|
||||
POST http://localhost:8000/marker/upload
|
||||
Content-Type: multipart/form-data
|
||||
|
||||
file=@document.pdf
|
||||
output_format=json
|
||||
```
|
||||
|
||||
### Response (abbreviated)
|
||||
|
||||
```json
|
||||
{
  "output_format": "json",
  "output": {
    "block_type": "Document",
    "children": [
      {
        "block_type": "Page",
        "id": "/page/0/Page/0",
        "children": [
          {
            "block_type": "SectionHeader",
            "id": "/page/0/SectionHeader/0",
            "html": "<h1>Cavernous Sinus Anatomy</h1>"
          },
          {
            "block_type": "Text",
            "id": "/page/0/Text/1",
            "html": "<p>The cavernous sinus contains...</p>"
          },
          {
            "block_type": "Figure",
            "id": "/page/0/Figure/2",
            "html": "<figure><img src='/page/0/Figure/2'/></figure>",
            "images": {
              "/page/0/Figure/2": "iVBORw0KGgo..."
            }
          },
          {
            "block_type": "Caption",
            "id": "/page/0/Caption/3",
            "html": "<p>Fig. 12-4. Coronal cross-section...</p>"
          }
        ]
      }
    ],
    "metadata": { "page_stats": [...] }
  }
}
|
||||
```
|
||||
|
||||
### Java Integration Pattern
|
||||
|
||||
```java
|
||||
// MarkerPageParser — core call
MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
body.add("file", new FileSystemResource(pdfPath));
body.add("output_format", "json");

JsonNode response = restClient.post()
        .uri(baseUrl + "/marker/upload")
        .contentType(MediaType.MULTIPART_FORM_DATA)
        .body(body)
        .retrieve()
        .body(JsonNode.class);

JsonNode document = response.get("output");
|
||||
```
|
||||
|
||||
### Mapping Marker Blocks to PageResult
|
||||
|
||||
```
|
||||
Page block (id "/page/N/Page/M")      → PageResult(pageNumber = N+1)
SectionHeader children                → headingTitle (first match)
Text, TextInlineMath children         → orderedText (HTML stripped, joined \n\n)
Figure children with images map       → FigureData(imageBytes = base64decode(images[id]))
Caption sibling of Figure             → FigureData.nearestCaption
|
||||
```
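
Two of the mapping rules above are easy to get subtly wrong: the 0-based page index inside the block id versus the 1-based `pageNumber`, and the HTML stripping. The sketch below shows one way to handle both. `MarkerBlocks` is a hypothetical helper, and the tag stripping is deliberately naive: it assumes simple, well-formed markup like the sample response and does not decode HTML entities.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helpers for translating Marker block fields.
final class MarkerBlocks {

    private static final Pattern PAGE_ID = Pattern.compile("^/page/(\\d+)/");
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    private MarkerBlocks() {}

    /** Converts the 0-based page index in a block id ("/page/N/...") to a 1-based page number. */
    static int pageNumber(String blockId) {
        Matcher matcher = PAGE_ID.matcher(blockId);
        if (!matcher.find()) {
            throw new IllegalArgumentException("Not a Marker block id: " + blockId);
        }
        return Integer.parseInt(matcher.group(1)) + 1;
    }

    /** Removes tags from a block's html field; entities such as &amp; are left as-is. */
    static String stripHtml(String html) {
        return TAG.matcher(html).replaceAll("").trim();
    }
}
```
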
|
||||
|
||||
---
|
||||
|
||||
## Architecture Change
|
||||
|
||||
```
|
||||
Before (Document AI — removed):
  DocumentAiPageParser
    → Google Document AI API (GCP, 15-page batches, credentials)
    → returns text blocks + figure bboxes
  PdfStructureParser (PDFBox column heuristic)
  FigureExtractionService
    → renders page via PDFBox at 150 DPI
    → crops bbox region

After (Marker):
  MarkerPageParser
    → POST PDF to http://localhost:8000/marker/upload (output_format=json)
    → returns text blocks (correct reading order) + Figure blocks with base64 images
    → produces List<PageResult> (same DTO, FigureData carries bytes not bbox)
  FigureExtractionService (simplified)
    → base64-decodes image bytes from PageResult.FigureData
    → checks min size (ImageIO.read → getWidth/getHeight)
    → saves to S3 via FigureStorageService (UNCHANGED)
  VisionDescriptionService (UNCHANGED)
  BookEmbeddingService orchestration (MINOR: inject MarkerPageParser)
|
||||
```
|
||||
|
||||
**What is removed**:
|
||||
- `DocumentAiPageParser` — replaced by `MarkerPageParser`
|
||||
- `DocumentAiConfig` — replaced by `MarkerConfig`
|
||||
- `PdfStructureParser` — Marker handles reading order
|
||||
- `google-cloud-document-ai` Maven dependency
|
||||
- `app.document-ai.*` configuration properties
|
||||
|
||||
**What stays the same**:
|
||||
- `PageResult` DTO structure (fields renamed, not restructured)
|
||||
- `FigureExtractionService` public interface
|
||||
- `TextChunkingService`, `VisionDescriptionService`, `BookEmbeddingService` orchestration
|
||||
- All JPA entities, repositories, vector store, S3 storage
|
||||
|
||||
---
|
||||
|
||||
## Constitution Compliance
|
||||
|
||||
| Principle | Assessment |
|-----------|------------|
| **I. KISS** | ✅ Simpler than Document AI — one HTTP call replaces GCP SDK + PDFBox render loop. No new dependency beyond an HTTP client (Spring RestClient, already available). |
| **II. Easy to Change** | ✅ `MarkerPageParser` is the only Marker-aware class. Swap it to use any other parser. `PageResult` DTO unchanged in contract. |
| **III. Web-First** | ✅ Internal pipeline change; no API contract change. |
| **IV. Documentation** | ✅ README must show Marker as a local external service dependency. |
|
||||
|
||||
---
|
||||
|
||||
## Risks & Mitigations
|
||||
|
||||
| Risk | Likelihood | Mitigation |
|------|-----------|------------|
| Marker server not running when book is uploaded | Medium | `BookEmbeddingService` catches exception from `MarkerPageParser`, marks book as `FAILED`, logs full error. |
| Marker misses some figures (complex PDFs) | Medium | `app.figure-storage.min-image-size-px` threshold can be tuned. Add fallback: if Marker returns 0 figures for a page with known images, log a warning. |
| SC-003 (≤ 3× processing time) violated | Low | Marker runs locally (no network latency to cloud). Benchmark with a real 500-page book early. |
| Large PDF upload to Marker (>100MB) | Low | Marker server handles the full file; no batching needed. Multipart upload limit configurable. |
| Marker image quality vs PDFBox crop | Low | Marker crops at native resolution; quality is equivalent or better than 150 DPI PDFBox render. |
|
||||
@@ -0,0 +1,176 @@
|
||||
# Feature Specification: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Feature Branch**: `002-image-aware-embedding`
|
||||
**Created**: 2026-04-03
|
||||
**Status**: Draft
|
||||
**Input**: User description: "I want to enhance the embedding process. I want also parse image from each pages if any and add proper metadata so that it can match the retrieved chunk/vector that match what user are querying."
|
||||
|
||||
## User Scenarios & Testing *(mandatory)*
|
||||
|
||||
### User Story 1 - Image Content Surfaced in Query Results (Priority: P1)
|
||||
|
||||
A neurosurgeon asks a question in the chat (e.g., "Show me the anatomy of the Circle of Willis")
|
||||
that is best answered by a diagram or figure in an uploaded book. The system retrieves the image
|
||||
content — its description and surrounding context — and uses it to construct a grounded answer,
|
||||
citing the page and book where the image appeared.
|
||||
|
||||
**Why this priority**: This is the direct, user-visible payoff of the feature. Without it, the
|
||||
enhancement has no observable benefit. All other stories support this outcome.
|
||||
|
||||
**Independent Test**: Upload a book containing a labelled anatomical diagram. Ask a query whose
|
||||
answer is conveyed by that diagram (not in the surrounding text). Confirm the system returns an
|
||||
answer that references the diagram's content and cites the correct book and page.
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** a book with an anatomical diagram on page 42, **When** a user asks a question whose
|
||||
answer is only depicted in that diagram, **Then** the system returns a response that draws on
|
||||
the diagram's content and cites "Page 42, [Book Title]".
|
||||
2. **Given** a page with both text and an image, **When** the system retrieves that page's content,
|
||||
**Then** the image-derived content and the surrounding text are each independently retrievable
|
||||
and independently citable.
|
||||
3. **Given** a query that has no relevant image in any uploaded book, **When** the system searches,
|
||||
**Then** it does not fabricate image-derived content and falls back to text-only results (or
|
||||
states no relevant content was found).
|
||||
|
||||
---
|
||||
|
||||
### User Story 2 - All Pages Scanned for Images During Embedding (Priority: P1)
|
||||
|
||||
When a book is uploaded and processed, every page is inspected for images. Any image found is
|
||||
extracted and represented as a searchable content chunk enriched with metadata (page number,
|
||||
book title, position on page, caption if present). Pages without images are processed as
|
||||
text-only chunks, unchanged from the existing behaviour.
|
||||
|
||||
**Why this priority**: This is the prerequisite for User Story 1. Without systematic per-page
|
||||
image detection, image content cannot be retrieved.
|
||||
|
||||
**Independent Test**: Upload a book whose pages include a mix of text-only and image-containing
|
||||
pages. After processing completes, verify that chunks exist for each image page and that each
|
||||
image chunk carries the correct metadata (page number, source book, caption).
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** a book being processed, **When** the embedding pipeline runs, **Then** every page
|
||||
is evaluated for images and each detected image generates at least one content chunk.
|
||||
2. **Given** an image with a caption or label, **When** the chunk is created, **Then** the
|
||||
caption or label text is included in the chunk's content and metadata.
|
||||
3. **Given** a page with multiple images, **When** processing completes, **Then** each image is
|
||||
represented as a separate chunk with its own metadata, not merged into a single chunk.
|
||||
4. **Given** a page with no images, **When** processing completes, **Then** no image chunk is
|
||||
created for that page and text processing is unaffected.
|
||||
|
||||
---
|
||||
|
||||
### User Story 3 - Rich Metadata Enables Precise Source Attribution (Priority: P2)
|
||||
|
||||
When the system returns a result based on image content, the user can see exactly where that
|
||||
image appeared: which book, which page, and what type of content (diagram, table, photograph,
|
||||
etc.). This gives the user confidence in the source and lets them locate the original image
|
||||
in their physical or digital copy of the book.
|
||||
|
||||
**Why this priority**: Metadata quality directly impacts user trust. Neurosurgeons require
|
||||
traceable, citable evidence. Richer metadata also improves retrieval accuracy by giving the
|
||||
search engine more signals to match against a query.
|
||||
|
||||
**Independent Test**: Retrieve a result sourced from an image chunk. Inspect the displayed
|
||||
citation and verify it includes: book title, page number, content type (e.g., "diagram"),
|
||||
and caption (if present in the original).
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** a retrieved image chunk, **When** the system displays the source citation,
|
||||
**Then** the citation includes at minimum: book title, page number, and a content-type
|
||||
label (e.g., diagram, table, figure).
|
||||
2. **Given** an image chunk with a detected caption, **When** the citation is displayed,
|
||||
**Then** the caption text is shown alongside the other metadata fields.
|
||||
3. **Given** a topic summary that draws on both text and image chunks, **When** the user
|
||||
inspects citations, **Then** image-sourced and text-sourced claims are distinguishable
|
||||
from each other.
|
||||
|
||||
---
|
||||
|
||||
### Edge Cases
|
||||
|
||||
- What happens when an image is too small to contain meaningful content (e.g., a decorative
|
||||
bullet icon or a publisher logo)?
|
||||
- How does the system handle a page that is entirely an image (scanned page with no digital text)?
|
||||
- What if an image spans multiple pages (e.g., a fold-out diagram)?
|
||||
- How does the system behave when an image has no caption and its surrounding text provides
|
||||
no useful context?
|
||||
- What happens if image processing fails for a specific page — does it abort the whole book
|
||||
or continue with the remaining pages?
|
||||
|
||||
## Requirements *(mandatory)*
|
||||
|
||||
### Functional Requirements
|
||||
|
||||
- **FR-001**: System MUST inspect every page of an uploaded book for the presence of images
|
||||
during the embedding process.
|
||||
- **FR-002**: System MUST extract each detected image and create a dedicated, independently
|
||||
searchable content chunk for it.
|
||||
- **FR-003**: System MUST generate a descriptive textual representation of each extracted
|
||||
image so its content is semantically searchable by the retrieval system.
|
||||
- **FR-004**: System MUST associate the following metadata with every image chunk: book title,
|
||||
page number, content type (e.g., diagram, table, figure, photograph), and caption text
|
||||
(where present).
|
||||
- **FR-005**: System MUST include the same base metadata (book title, page number) on text
|
||||
chunks so that all retrieved content — image or text — carries consistent, comparable
|
||||
source attribution.
|
||||
- **FR-006**: System MUST treat image chunks as first-class retrievable units: they must be
|
||||
ranked and returned alongside text chunks when they are relevant to a user query.
|
||||
- **FR-007**: System MUST skip images that fall below a minimum meaningful-content threshold
|
||||
(e.g., decorative icons, page separators) and MUST NOT create chunks for them.
|
||||
- **FR-008**: If image processing fails for a specific page, the system MUST log the failure,
|
||||
skip that page's image, and continue processing the remaining pages and text content of
|
||||
the book.
|
||||
- **FR-009**: System MUST display image-sourced content citations distinctly from text-sourced
|
||||
citations so users can identify when a result originates from a visual element.
|
||||
- **FR-010**: Processing a book that contains images MUST NOT degrade the accuracy or
|
||||
completeness of the existing text-only embedding for that book.
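FR-007 and FR-008 together describe a filter-then-isolate loop. The following is a minimal Java sketch of that behaviour, not the project's implementation: `PageImage`, `ImageChunkPipelineSketch`, and `describe` are hypothetical names, and treating "minimum meaningful-content threshold" as a check on the image's shorter side is an assumption, since the spec leaves the threshold to be determined during implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an image found on a page; not a type from the codebase. */
record PageImage(int page, int widthPx, int heightPx, boolean corrupt) {}

final class ImageChunkPipelineSketch {

    /** FR-007 (assumed rule): keep an image only if its shorter side reaches the threshold. */
    static boolean isMeaningful(PageImage image, int minSizePx) {
        return Math.min(image.widthPx(), image.heightPx()) >= minSizePx;
    }

    /** FR-008: a failure on one image is recorded and skipped; the remaining images continue. */
    static List<String> process(List<PageImage> images, int minSizePx, List<String> failures) {
        List<String> chunks = new ArrayList<>();
        for (PageImage image : images) {
            if (!isMeaningful(image, minSizePx)) {
                continue; // decorative icon or separator: no chunk is created
            }
            try {
                chunks.add(describe(image));
            } catch (RuntimeException e) {
                // Record the failure and move on instead of aborting the whole book.
                failures.add("page " + image.page() + ": " + e.getMessage());
            }
        }
        return chunks;
    }

    /** Placeholder for the description step (FR-003); fails for unreadable images. */
    static String describe(PageImage image) {
        if (image.corrupt()) {
            throw new IllegalStateException("unreadable image");
        }
        return "image chunk for page " + image.page();
    }
}
```

The failures list stands in for logging so the sketch stays dependency-free; a real pipeline would presumably write to its logger instead.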
|
||||
|
||||
### Key Entities
|
||||
|
||||
- **Image Chunk**: A searchable content unit derived from a page image. Attributes: generated
|
||||
description, source book title, page number, content type, caption (optional), embedding vector.
|
||||
- **Text Chunk**: Existing unit; extended to carry explicit metadata: source book title,
|
||||
page number, section heading (if detectable), content type ("text").
|
||||
- **Chunk Metadata**: Structured attributes attached to every chunk regardless of type,
|
||||
enabling consistent filtering and citation. Mandatory fields: book title, page number,
|
||||
content type. Optional fields: caption, section heading.
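The mandatory/optional split for Chunk Metadata can be illustrated with a small Java record. This is a sketch under stated assumptions: the record name, the flat-map key names (`book_title`, `page_number`, and so on), and the 1-based page check are all hypothetical and are not taken from the data model.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical metadata holder: three mandatory fields, two optional (nullable) ones. */
record ChunkMetadataSketch(String bookTitle, int pageNumber, String contentType,
                           String caption, String sectionHeading) {

    ChunkMetadataSketch {
        Objects.requireNonNull(bookTitle, "bookTitle is mandatory");
        Objects.requireNonNull(contentType, "contentType is mandatory");
        if (pageNumber < 1) {
            // Assumption: pages are numbered from 1.
            throw new IllegalArgumentException("pageNumber must be >= 1");
        }
    }

    /** Flat map form suitable for storing next to a vector; absent optional fields are omitted. */
    Map<String, Object> toMap() {
        Map<String, Object> map = new LinkedHashMap<>();
        map.put("book_title", bookTitle);
        map.put("page_number", pageNumber);
        map.put("content_type", contentType);
        if (caption != null) {
            map.put("caption", caption);
        }
        if (sectionHeading != null) {
            map.put("section_heading", sectionHeading);
        }
        return map;
    }
}
```

Using the same record for text and image chunks is one way to keep citations comparable across both chunk types, as FR-005 requires.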
|
||||
|
||||
## Success Criteria *(mandatory)*
|
||||
|
||||
### Measurable Outcomes
|
||||
|
||||
- **SC-001**: At least 90% of pages containing images in a test book result in a retrievable
|
||||
image chunk after processing completes.
|
||||
- **SC-002**: A controlled set of 10 queries whose answers are conveyed by diagrams in an
|
||||
uploaded book returns at least 7 correct image-sourced answers (70% recall on image queries).
|
||||
- **SC-003**: Embedding processing time for a book with images increases by no more than 3×
|
||||
compared to processing the same book as text-only, for books up to 500 pages.
|
||||
- **SC-004**: Every retrieved result — text or image — includes a citation that identifies
|
||||
at minimum the source book title and page number, with 100% coverage across a test result set.
|
||||
- **SC-005**: In a user evaluation with 5 representative queries that previously returned
|
||||
no useful results (because the answer was only in a diagram), at least 4 now return a
|
||||
useful, grounded answer.
|
||||
|
||||
## Assumptions
|
||||
|
||||
- Books are still uploaded exclusively as PDFs; image parsing applies to PDF pages only.
|
||||
- The platform already has a working text-only embedding pipeline (from feature 001); this
|
||||
feature enhances it without replacing or rewriting the text processing logic.
|
||||
- Images worth processing are those that occupy a meaningful portion of the page; small
|
||||
decorative or structural images (logos, dividers, icons) are excluded based on a size
|
||||
threshold determined during implementation.
|
||||
- The descriptive representation of an image (FR-003) is generated at embedding time, not
|
||||
at query time; query latency is not affected by image interpretation.
|
||||
- The shared global book library model from feature 001 is retained; image chunks from a
|
||||
processed book are available to all users immediately upon completion.
|
||||
- Scanned pages (fully rasterised pages with no digital text layer) are treated as a single
|
||||
full-page image; the system attempts to extract content from them but does not guarantee
|
||||
the same fidelity as pages with digital text.
|
||||
- Per-chunk metadata is stored alongside the vector so it can be used for both retrieval
|
||||
filtering and source citation display without a separate lookup.
|
||||
- Books already processed under feature 001 (text-only) are not automatically re-processed;
|
||||
re-embedding must be triggered explicitly by the user or an administrator.
|
||||
@@ -0,0 +1,169 @@
|
||||
# Tasks: Enhanced Embedding with Image Parsing and Metadata
|
||||
|
||||
**Input**: Design documents from `/specs/002-image-aware-embedding/`
|
||||
**Prerequisites**: plan.md ✓ | spec.md ✓ | research.md ✓ | data-model.md ✓ | contracts/ ✓
|
||||
|
||||
**Organization**: Tasks grouped by user story to enable independent implementation and testing.
|
||||
|
||||
## Format: `[ID] [P?] [Story] Description`
|
||||
|
||||
- **[P]**: Can run in parallel (different files, no shared dependencies)
|
||||
- **[US1/US2/US3]**: Which user story this task belongs to
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Setup (Shared Infrastructure)
|
||||
|
||||
**Purpose**: Database migrations and configuration that establish the foundation for all new code
|
||||
|
||||
- [X] T001 Create Flyway migration `V4__document_hierarchy.sql` — add `chapter` and `section` tables per data-model.md §Postgres Schema in `backend/src/main/resources/db/migration/V4__document_hierarchy.sql`
|
||||
- [X] T002 Create Flyway migration `V5__figures_and_refs.sql` — add `figure` and `chunk_figure_ref` tables per data-model.md §Postgres Schema in `backend/src/main/resources/db/migration/V5__figures_and_refs.sql`
|
||||
- [X] T003 Add figure-storage configuration keys to `backend/src/main/resources/application.properties`: `app.figure-storage.base-path=./uploads` and `app.figure-storage.min-image-size-px=100`
|
||||
- [X] T004 Add `uploads/` directory to `.gitignore` at repo root; create `uploads/figures/.gitkeep` to preserve directory structure
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Foundational (Blocking Prerequisites)
|
||||
|
||||
**Purpose**: Core types and infrastructure that ALL user stories depend on — nothing in Phase 3+ can start until this phase is complete
|
||||
|
||||
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
|
||||
|
||||
- [X] T005 [P] Create `FigureType` enum in `backend/src/main/java/com/aiteacher/document/FigureType.java` — values: `ANATOMICAL_DIAGRAM`, `SURGICAL_PHOTOGRAPH`, `MRI_CT_SCAN`, `TABLE`, `CHART`, `INTRAOPERATIVE_IMAGE`
|
||||
- [X] T006 [P] Create `FigureStorageService` interface in `backend/src/main/java/com/aiteacher/figure/FigureStorageService.java` — declare `Path save(UUID bookId, String figureId, BufferedImage image)`, `Path resolve(UUID bookId, String filename)`, and `void delete(UUID bookId)`
|
||||
- [X] T007 Create `LocalFigureStorageService` implementation in `backend/src/main/java/com/aiteacher/figure/LocalFigureStorageService.java` — writes PNG files under `${app.figure-storage.base-path}/figures/{bookId}/`; implements `FigureStorageService`; depends on T006
|
||||
- [X] T008 Create `FigureStorageConfig` bean in `backend/src/main/java/com/aiteacher/config/FigureStorageConfig.java` — reads `app.figure-storage.base-path` and `app.figure-storage.min-image-size-px` as `@ConfigurationProperties`; registers `LocalFigureStorageService` as `@Bean`; adds `ResourceHandler` mapping `GET /api/v1/figures/**` to the base-path directory
|
||||
- [X] T009 [P] Create `ChapterEntity` JPA entity and `ChapterRepository` in `backend/src/main/java/com/aiteacher/document/` — `@Entity(name="chapter")`, fields: `id` (String PK), `bookId` (UUID FK → book), `number` (int), `title` (String), `pageStart` (int), `createdAt` (Instant); `ChapterRepository extends JpaRepository<ChapterEntity, String>`
|
||||
- [X] T010 [P] Create `SectionEntity` JPA entity and `SectionRepository` in `backend/src/main/java/com/aiteacher/document/` — `@Entity(name="section")`, fields: `id` (String PK), `chapterId` (String FK → chapter), `bookId` (UUID FK → book), `number` (String), `title` (String), `pageStart`/`pageEnd` (int), `fullText` (TEXT column), `createdAt` (Instant); `SectionRepository extends JpaRepository<SectionEntity, String>` with `findAllByBookId(UUID)`
|
||||
- [X] T011 [P] Create `FigureEntity` JPA entity and `FigureRepository` in `backend/src/main/java/com/aiteacher/document/` — `@Entity(name="figure")`, fields: `id` (String PK), `bookId` (UUID), `sectionId` (String, nullable), `chapterId` (String, nullable), `label` (String), `caption` (TEXT), `figureType` (`@Enumerated` FigureType), `page` (int), `imagePath` (String), `captionEmbeddingId` (UUID, nullable), `createdAt` (Instant); `FigureRepository` with `findAllByBookId(UUID)`, `deleteAllByBookId(UUID)`
|
||||
- [X] T012 Create `ChunkFigureRefEntity` JPA entity and `ChunkFigureRefRepository` in `backend/src/main/java/com/aiteacher/document/` — composite PK `(chunkId UUID, figureId String)`, `mentionPage` (int); `ChunkFigureRefRepository` with `findByChunkIdIn(List<UUID>)`, `deleteByFigureIdIn(List<String>)`
|
||||
|
||||
**Checkpoint**: Migrations will run on next startup; all JPA entities are wired; figure storage reads config correctly
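As an illustration of the storage layout described in T006 and T007, here is a minimal, Spring-free sketch of a local PNG store. It is not the actual `LocalFigureStorageService`: the class name is hypothetical, and naming files `{figureId}.png` is an assumption, since the tasks only specify the directory `${app.figure-storage.base-path}/figures/{bookId}/`.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;
import javax.imageio.ImageIO;

/** Hypothetical sketch of a filesystem-backed figure store. */
final class LocalFigureStorageSketch {
    private final Path basePath;

    LocalFigureStorageSketch(Path basePath) {
        this.basePath = basePath;
    }

    /** Writes the image as PNG under {basePath}/figures/{bookId}/ and returns the file path. */
    Path save(UUID bookId, String figureId, BufferedImage image) {
        try {
            Path dir = Files.createDirectories(bookDir(bookId)); // creates missing parents
            Path target = dir.resolve(figureId + ".png");        // assumed file naming
            if (!ImageIO.write(image, "png", target.toFile())) {
                throw new IOException("no PNG writer available");
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Resolves a stored filename for a book without touching the filesystem. */
    Path resolve(UUID bookId, String filename) {
        return bookDir(bookId).resolve(filename);
    }

    private Path bookDir(UUID bookId) {
        return basePath.resolve("figures").resolve(bookId.toString());
    }
}
```

A production version would also need the `delete(UUID bookId)` operation from T006 and should reject filenames containing path separators before resolving them.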
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: User Story 2 — All Pages Scanned for Images During Embedding (Priority: P1)
|
||||
|
||||
**Goal**: When a book is uploaded, every page is inspected for images; each found image is extracted, persisted, described, and embedded as a searchable chunk alongside its metadata
|
||||
|
||||
**Independent Test**: Upload a PDF containing at least one page with a labelled anatomical diagram. After status shows `READY`, call `GET /api/v1/books/{id}/figures` — response must contain at least one entry with `figureType`, `caption`, `page`, and `imageUrl` populated. Verify the PNG file exists at the path in `imagePath`.
|
||||
|
||||
- [X] T013 [US2] ~~Create `PdfStructureParser`~~ → **SUPERSEDED**: PDF parsing is handled by `MarkerPageParser` (see T013b). `PdfStructureParser` exists but is not wired into the pipeline.
|
||||
- [X] T013b [US2] Create `MarkerPageParser` in `backend/src/main/java/com/aiteacher/document/MarkerPageParser.java` — POSTs PDF to `http://localhost:8000/marker/upload?output_format=json` via Spring `RestClient`; parses JSON response into `List<PageResult>` (one per page block); extracts heading, ordered text, and pre-cropped figure PNG bytes per page
|
||||
- [X] T014 [US2] Update `FigureExtractionService` in `backend/src/main/java/com/aiteacher/document/FigureExtractionService.java` — **Marker migration**: removed PDFBox rendering + bbox-crop loop; decodes PNG bytes from `PageResult.FigureData` via `ImageIO.read()`; skips images below `min-image-size-px`; classifies `FigureType`; saves via `FigureStorageService`; persists `FigureEntity`
|
||||
- [X] T015 [US2] Create `VisionDescriptionService` in `backend/src/main/java/com/aiteacher/document/VisionDescriptionService.java` — accepts a `Path` to a PNG and a caption String; calls the OpenAI vision model (via Spring AI `ChatClient` with image media type) to generate a 2–4 sentence clinical description; returns the generated description string; handles API failures by returning the caption as fallback
|
||||
- [X] T016 [US2] Create `TextChunkingService` in `backend/src/main/java/com/aiteacher/document/TextChunkingService.java` — accepts a `SectionEntity`; splits `fullText` into overlapping 400–600 token windows (20-token overlap); wraps each window in a Spring AI `Document` with the flat metadata map defined in data-model.md §Text chunk document; returns `List<Document>`
|
||||
- [X] T017 [US2] Create `ChunkFigureRefService` in `backend/src/main/java/com/aiteacher/document/ChunkFigureRefService.java` — accepts a Spring AI `Document` (with its `id` as `chunkId`) and a `List<FigureEntity>` for the book; scans chunk text for patterns `Fig\.\s*\d+[\-\.]\d+` and `Figure\s+\d+[\-\.]\d+`; matches against figure labels; persists `ChunkFigureRefEntity` rows via `ChunkFigureRefRepository`
|
||||
- [X] T018 [US2] Update `BookEmbeddingService.embedBook()` — **Marker migration**: injected `MarkerPageParser` replacing `DocumentAiPageParser`; updated `figureExtractionService.extract()` call (removed `pdfPath` arg); updated log message. Pipeline: (1) `MarkerPageParser` → `List<PageResult>`; (2) `buildAndSaveSections()` → sections; (3) `TextChunkingService` → chunks → embed; (4) `FigureExtractionService.extract()` → figures; (5) `VisionDescriptionService` → embed figure chunks; (6) `ChunkFigureRefService` → refs
|
||||
- [X] T019 [US2] Extend `BookEmbeddingService.deleteBookChunks()` to also delete: all `ChunkFigureRefEntity` rows (via `deleteByFigureIdIn`), all `FigureEntity` rows (via `deleteAllByBookId`), all figure PNG files (via `FigureStorageService.delete(bookId)`), all `SectionEntity` and `ChapterEntity` rows for the book
|
||||
- [X] T020 [US2] Add `POST /api/v1/books/{id}/reembed` endpoint to `BookController` in `backend/src/main/java/com/aiteacher/book/BookController.java` — returns `202` with `{ bookId, status: "PROCESSING" }`; returns `404` if not found; returns `409` if already `PROCESSING`; calls `deleteBookChunks()` then `embedBook()` asynchronously
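The overlapping-window split in T016 can be sketched as follows. This is an illustrative sketch only: the class name is hypothetical, and it operates on an already tokenized list, so how tokens are counted (model tokenizer versus whitespace words) is left as an assumption outside the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of fixed-size windows that share `overlap` tokens with their predecessor. */
final class OverlappingWindowSketch {

    static List<List<String>> windows(List<String> tokens, int windowSize, int overlap) {
        if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
            throw new IllegalArgumentException("require 0 <= overlap < windowSize");
        }
        List<List<String>> result = new ArrayList<>();
        int step = windowSize - overlap; // each window starts this many tokens after the previous one
        for (int start = 0; start < tokens.size(); start += step) {
            int end = Math.min(start + windowSize, tokens.size());
            result.add(List.copyOf(tokens.subList(start, end)));
            if (end == tokens.size()) {
                break; // the last window reached the end of the text
            }
        }
        return result;
    }
}
```

With a window of 4 and an overlap of 1, ten tokens `a..j` yield three windows: `a-d`, `d-g`, and `g-j`. The task's 400–600 token range would correspond to choosing `windowSize` within that range and `overlap = 20`.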
|
||||
|
||||
**Checkpoint**: Upload a PDF with figures → poll `GET /api/v1/books` for `READY` → `GET /api/v1/books/{id}/figures` returns figure list → PNG accessible at `GET /api/v1/figures/{bookId}/{filename}`
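The figure-mention scan in T017 can be sketched with `java.util.regex`. The two patterns from the task are combined into a single expression here, with a capture group for the number. The class and method names are hypothetical, and how the extracted numbers are matched against stored figure labels is left out because the task does not specify the label format.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: finds "Fig. 3-2" / "Figure 4.1" style mentions in chunk text. */
final class FigureMentionSketch {

    // Combines Fig\.\s*\d+[\-\.]\d+ and Figure\s+\d+[\-\.]\d+; group 1 is the figure number.
    private static final Pattern MENTION =
            Pattern.compile("(?:Fig\\.\\s*|Figure\\s+)(\\d+[-.]\\d+)");

    /** Returns the distinct figure numbers mentioned, in order of first appearance. */
    static Set<String> mentionedNumbers(String chunkText) {
        Set<String> numbers = new LinkedHashSet<>();
        Matcher matcher = MENTION.matcher(chunkText);
        while (matcher.find()) {
            numbers.add(matcher.group(1));
        }
        return numbers;
    }
}
```

Note that the patterns are case-sensitive and require a two-part number, so "figure 5" and "Fig. 7" are not matched by this sketch.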
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: User Story 1 — Image Content Surfaced in Query Results (Priority: P1)
|
||||
|
||||
**Goal**: User asks a question answered by a diagram — the system retrieves that diagram's content and surfaces it in the chat response with a citation
|
||||
|
||||
**Independent Test**: With a book embedded (Phase 3 checkpoint passed), ask a chat question whose answer is depicted only in a diagram. The response `sources` array must contain at least one entry with `type: "FIGURE"` and a non-empty `imageUrl`.
|
||||
|
||||
- [X] T021 [US1] Create `NeurosurgeryRetriever` service in `backend/src/main/java/com/aiteacher/retrieval/NeurosurgeryRetriever.java` — (1) text chunk search: `vectorStore.similaritySearch` with filter `type == TEXT AND book_id == bookId`, topK=5; (2) figure search: same store, filter `type == FIGURE AND book_id == bookId`, topK=3; (3) expand text chunk results to parent sections via `SectionRepository.findAllById(sectionIds)`; (4) fetch explicitly linked figures via `ChunkFigureRefRepository.findByChunkIdIn(chunkIds)` + `FigureRepository.findAllById`; (5) deduplicate figures across lists by `figureId`; return `RetrievalResult(parentSections, figureVectorHits, linkedFigures)` — add `RetrievalResult` record in same package
|
||||
- [X] T022 [US1] Refactor `ChatService.sendMessage()` in `backend/src/main/java/com/aiteacher/chat/ChatService.java` — replace `QuestionAnswerAdvisor` with a manual call to `NeurosurgeryRetriever`; build the LLM user message from: section full texts as `[Section X.Y — Title, pp.A-B]\n{fullText}` blocks, followed by `AVAILABLE FIGURES FOR THIS SECTION:` list with `- {label} (p.{page}): {caption} [image: {filename}]` lines per figure; append the instruction `When referencing diagrams, cite them as [Fig. X, p.N].`; send via `chatClient.prompt().system(SYSTEM_PROMPT).user(prompt).call()`
|
||||
- [X] T023 [US1] Add `GET /api/v1/books/{id}/figures` endpoint to `BookController` — returns `200` with `List<FigureResponse>`; `FigureResponse` is a new record in `backend/src/main/java/com/aiteacher/book/FigureResponse.java` with fields `figureId`, `label`, `caption`, `figureType`, `page`, `imageUrl` (assembled as `/api/v1/figures/{bookId}/{filename}`), `sectionId`, `sectionTitle`; returns `404` if book not found
|
||||
- [X] T024 [US1] Update `extractSources()` in `ChatService` to build both TEXT and FIGURE source entries: TEXT entries keep existing fields plus `"type": "TEXT"`; FIGURE entries add `"type": "FIGURE"`, `"figureId"`, `"label"`, `"caption"`, `"figureType"`, `"imageUrl"` — source data comes from `RetrievalResult` (text chunk Documents and merged FigureEntity list)
|
||||
|
||||
**Checkpoint**: Chat question answered by a diagram → response body contains `sources[n].type == "FIGURE"` with populated `imageUrl`; image loads from the returned URL
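Step (5) of T021, merging figures found by vector search with figures linked through chunk references, amounts to a keyed de-duplication. A minimal sketch, assuming a hypothetical `FigureHit` record in place of `FigureEntity` and assuming vector hits take precedence when the same `figureId` appears in both lists:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a figure returned by either retrieval path. */
record FigureHit(String figureId, String label) {}

final class FigureMergeSketch {

    /** Merges both lists, keeping the first occurrence of each figureId and preserving order. */
    static List<FigureHit> merge(List<FigureHit> vectorHits, List<FigureHit> linkedFigures) {
        Map<String, FigureHit> byId = new LinkedHashMap<>();
        for (FigureHit hit : vectorHits) {
            byId.putIfAbsent(hit.figureId(), hit);
        }
        for (FigureHit hit : linkedFigures) {
            byId.putIfAbsent(hit.figureId(), hit);
        }
        return new ArrayList<>(byId.values());
    }
}
```

A `LinkedHashMap` keeps the ranking order of the vector hits, which may matter if the chat prompt lists figures in relevance order.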
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: User Story 3 — Rich Metadata Enables Precise Source Attribution (Priority: P2)
|
||||
|
||||
**Goal**: Users see distinct, informative citations for text vs. image sources; image sources render inline in the chat UI
|
||||
|
||||
**Independent Test**: After triggering a response with figure sources, inspect the chat message in the UI — text sources and figure sources are visually distinguishable; figure sources render the actual image inline using the `imageUrl`
|
||||
|
||||
- [X] T025 [P] [US3] Update API response types in `frontend/src/services/api.ts` — extend the `Source` type to include `type: 'TEXT' | 'FIGURE'`, `figureId?: string`, `label?: string`, `caption?: string`, `figureType?: string`, `imageUrl?: string`
|
||||
- [X] T026 [P] [US3] Update the chat source/citation display in the frontend (wherever sources are currently rendered, e.g. `frontend/src/components/` or `frontend/src/views/`) — render TEXT sources with a document icon and page number; render FIGURE sources with the image (`<img :src="source.imageUrl">`) below the label and caption text
|
||||
- [X] T027 [US3] Add figure-type badge rendering in the frontend figure display: show a label derived from `figureType` (e.g. "MRI / CT", "Anatomical Diagram", "Table") alongside the figure caption so users can identify content type without opening the image
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Polish & Cross-Cutting Concerns
|
||||
|
||||
- [X] T028 Update `README.md` Mermaid architecture diagram to show three storage tiers: pgvector (semantic search), Postgres (source of truth — sections, figures, refs), and file store (extracted PNGs) — **required by Constitution Principle IV in the same PR as the other changes**
|
||||
- [X] T029 [P] Write `FigureExtractionServiceTest` unit test in `backend/src/test/java/com/aiteacher/document/FigureExtractionServiceTest.java` — test: images below min size are skipped; `FigureType` classification matches keyword table in data-model.md; caption parsed from adjacent text line
|
||||
- [X] T030 [P] Write `NeurosurgeryRetrieverTest` unit test in `backend/src/test/java/com/aiteacher/retrieval/NeurosurgeryRetrieverTest.java` — test: figure IDs from both vector hits and chunk refs are merged without duplicates; `RetrievalResult` contains the deduplicated set
|
||||
- [X] T031 Run quickstart.md validation end-to-end: upload a real PDF with a labelled diagram → wait for `READY` → call `GET /api/v1/books/{id}/figures` → send a chat message about the diagram → verify `sources` contains a `FIGURE` entry → verify `imageUrl` resolves to a PNG
|
||||
|
||||
---
|
||||
|
||||
## Dependencies & Execution Order
|
||||
|
||||
### Phase Dependencies
|
||||
|
||||
- **Phase 1 (Setup)**: No dependencies — start immediately
|
||||
- **Phase 2 (Foundational)**: Requires Phase 1 complete (migrations must run before JPA entities can be wired)
|
||||
- **Phase 3 (US2)**: Requires Phase 2 complete — all JPA entities + FigureStorageService must exist
|
||||
- **Phase 4 (US1)**: Requires Phase 3 complete — figures must exist in Postgres + vector store before retrieval can surface them
|
||||
- **Phase 5 (US3)**: Requires Phase 4 complete — frontend depends on the extended `sources` format from T024
|
||||
- **Phase 6 (Polish)**: Requires all story phases complete
|
||||
|
||||
### Within Phase 3 (Embedding Pipeline)
|
||||
|
||||
```
T013b (MarkerPageParser) ───────────────────────────┐
T014 (FigureExtractionService) ─────────────────────┤
T015 (VisionDescriptionService) ────────────────────┤─→ T018 (BookEmbeddingService orchestrator)
T016 (TextChunkingService) ─────────────────────────┤     └─→ T019 (cleanup)
T017 (ChunkFigureRefService) ───────────────────────┘     └─→ T020 (reembed endpoint)
```
|
||||
|
||||
T013–T017 can be implemented in parallel (different files, no shared dependencies). T018 depends on all of them.
|
||||
|
||||
### Within Phase 4 (Retrieval)
|
||||
|
||||
```
T021 (NeurosurgeryRetriever) ──────────────────────┐
                                                   └─→ T022 (ChatService update)
                                                   └─→ T024 (extractSources update)
T023 (figures endpoint) ── independent [P]
```
|
||||
|
||||
### Parallel Opportunities per Phase
|
||||
|
||||
**Phase 2**: T005, T006, T009, T010, T011 can all run in parallel. T007 depends on T006. T012 can follow T010/T011.
|
||||
|
||||
**Phase 3**: T013, T014, T015, T016, T017 all in parallel. T018 depends on all.
|
||||
|
||||
**Phase 5**: T025 and T026 in parallel; T027 can follow T026.
|
||||
|
||||
**Phase 6**: T029 and T030 in parallel.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Strategy
|
||||
|
||||
### MVP: User Story 2 Only (Embedding Pipeline)
|
||||
|
||||
1. Phase 1 (Setup) → Phase 2 (Foundational) → Phase 3 (US2, T013–T020)
|
||||
2. **Validate**: `GET /api/v1/books/{id}/figures` returns figures for a test book
|
||||
3. **Stop and demo** — the pipeline produces image chunks without any retrieval changes
|
||||
|
||||
### Full Feature Delivery
|
||||
|
||||
1. Phase 1 + 2 → Foundation ready
|
||||
2. Phase 3 (US2) → Embedding pipeline produces image chunks ← **demo point**
|
||||
3. Phase 4 (US1) → Chat surfaces image content in responses ← **core payoff**
|
||||
4. Phase 5 (US3) → Frontend renders inline figures with type badges
|
||||
5. Phase 6 (Polish) → README, tests, end-to-end validation
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- [P] tasks = different files, no dependencies on each other within the same phase
|
||||
- [US1/US2/US3] label maps each task to a user story for traceability
|
||||
- Phase 3 (US2) must be fully complete before beginning Phase 4 (US1) — retrieval cannot surface figures that do not yet exist
|
||||
- The `uploads/figures/` directory must exist and be writable at runtime; `FigureStorageService` creates subdirectories automatically
|
||||
- Re-embedding (T020) deletes all existing chunks and figures for the book before re-running — safe to call on books processed by feature 001
|
||||
@@ -0,0 +1,35 @@
|
||||
# Specification Quality Checklist: Basic Login Protection
|
||||
|
||||
**Purpose**: Validate specification completeness and quality before proceeding to planning
|
||||
**Created**: 2026-04-06
|
||||
**Feature**: [spec.md](../spec.md)
|
||||
|
||||
## Content Quality
|
||||
|
||||
- [x] No implementation details (languages, frameworks, APIs)
|
||||
- [x] Focused on user value and business needs
|
||||
- [x] Written for non-technical stakeholders
|
||||
- [x] All mandatory sections completed
|
||||
|
||||
## Requirement Completeness
|
||||
|
||||
- [x] No [NEEDS CLARIFICATION] markers remain
|
||||
- [x] Requirements are testable and unambiguous
|
||||
- [x] Success criteria are measurable
|
||||
- [x] Success criteria are technology-agnostic (no implementation details)
|
||||
- [x] All acceptance scenarios are defined
|
||||
- [x] Edge cases are identified
|
||||
- [x] Scope is clearly bounded
|
||||
- [x] Dependencies and assumptions identified
|
||||
|
||||
## Feature Readiness
|
||||
|
||||
- [x] All functional requirements have clear acceptance criteria
|
||||
- [x] User scenarios cover primary flows
|
||||
- [x] Feature meets measurable outcomes defined in Success Criteria
|
||||
- [x] No implementation details leak into specification
|
||||
|
||||
## Notes
|
||||
|
||||
- All items pass. Spec is complete and ready for planning.
|
||||
- FR-012 resolved: credentials are managed via environment variables / config file (no in-app user management UI).
|
||||
@@ -0,0 +1,49 @@
|
||||
# API Contract: Auth
|
||||
|
||||
**Base path**: `/api/v1/auth`
|
||||
**Authentication**: HTTP Basic (all endpoints in this group require valid credentials)
|
||||
|
||||
---
|
||||
|
||||
## GET /api/v1/auth/check
|
||||
|
||||
Verifies that the supplied HTTP Basic credentials are valid. Used by the frontend after a page refresh to confirm stored credentials are still accepted before rendering the app.
|
||||
|
||||
### Request
|
||||
|
||||
```
GET /api/v1/auth/check
Authorization: Basic <base64(username:password)>
```
|
||||
|
||||
No request body.
|
||||
|
||||
### Response — 200 OK
|
||||
|
||||
```json
{
  "username": "neurosurgeon"
}
```
|
||||
|
||||
| Field | Type | Description |
|-------|------|-------------|
| `username` | string | The authenticated username |
|
||||
|
||||
### Response — 401 Unauthorized
|
||||
|
||||
Spring Security returns a standard 401 with `WWW-Authenticate: Basic realm="Realm"` header. No JSON body.
|
||||
|
||||
### Behaviour
|
||||
|
||||
- Returns `200` with the authenticated username if credentials are valid.
|
||||
- Returns `401` if credentials are absent or incorrect.
|
||||
- No side effects (idempotent, read-only).
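For reference, the `Authorization` value in the request above can be built as shown in this short Java sketch. The class and method names are hypothetical, and encoding the credentials as UTF-8 is an assumption; the contract itself only states `base64(username:password)`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper that builds an HTTP Basic Authorization header value. */
final class BasicAuthHeaderSketch {

    static String headerValue(String username, String password) {
        // Credentials are joined with a single colon, then Base64-encoded.
        String pair = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8)); // UTF-8 is an assumption
        return "Basic " + encoded;
    }
}
```

Because the username cannot be distinguished from the password if it contains a colon, a client would normally reject usernames that include `:` before calling a helper like this.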
|
||||
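
The behaviour listed above can be read from the client's side roughly as follows. This is a minimal sketch, not code from the repository: `AuthCheckOutcome` and `interpretAuthCheck` are hypothetical names, and treating any status other than 200 or 401 as "unexpected" is an assumption the contract does not spell out.

```typescript
// Hypothetical client-side reading of the GET /api/v1/auth/check contract.
type AuthCheckOutcome =
  | { kind: 'authenticated'; username: string }
  | { kind: 'rejected' }
  | { kind: 'unexpected'; httpStatus: number }

function interpretAuthCheck(httpStatus: number, body: unknown): AuthCheckOutcome {
  if (httpStatus === 200) {
    // 200 carries a JSON body of the form { "username": "..." }.
    const username = (body as { username?: unknown } | null)?.username
    if (typeof username === 'string') {
      return { kind: 'authenticated', username }
    }
    return { kind: 'unexpected', httpStatus }
  }
  if (httpStatus === 401) {
    // 401 has no JSON body: credentials were absent or incorrect.
    return { kind: 'rejected' }
  }
  // Assumption: anything else (e.g. 5xx) is not a verdict on the credentials.
  return { kind: 'unexpected', httpStatus }
}
```

Keeping "rejected" separate from "unexpected" lets a caller clear stored credentials only when the server has actually refused them.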
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- All other existing endpoints (`/api/v1/books`, `/api/v1/chat`, etc.) continue to require HTTP Basic Auth as before.
|
||||
- The frontend sends `Authorization: Basic ...` on every request via the axios request interceptor.
|
||||
- A global axios response interceptor detects `401` responses and redirects the user to `/login`.
|
||||
@@ -0,0 +1,35 @@
|
||||
# Data Model: Basic Login Protection
|
||||
|
||||
**Feature**: 003-basic-login
|
||||
**Date**: 2026-04-06
|
||||
|
||||
## No Backend Schema Changes
|
||||
|
||||
This feature introduces no new database tables or Flyway migrations. The user account is defined entirely in the Spring Security in-memory configuration (`SecurityConfig.java`) backed by environment variables.
|
||||
|
||||
## Frontend: Auth Store State
|
||||
|
||||
The Pinia `authStore` is the single source of truth for authentication state in the frontend.
|
||||
|
||||
```
|
||||
AuthState
|
||||
├── username: string | null — entered username, null if not logged in
|
||||
├── password: string | null — entered password, null if not logged in
|
||||
└── isAuthenticated: boolean — derived: true when both username and password are non-null
|
||||
|
||||
Actions
|
||||
├── login(username, password) — validates credentials via /api/v1/auth/check, stores in sessionStorage on success
|
||||
├── logout() — clears username, password, sessionStorage; redirects to /login
|
||||
└── restoreSession() — reads credentials from sessionStorage on app start; calls /api/v1/auth/check to verify still valid
|
||||
```
|
||||
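
The state tree above could be written as TypeScript types along these lines. This is a sketch only: `AuthSnapshot` and `deriveIsAuthenticated` are hypothetical names, and the real store is a Pinia store rather than a plain object.

```typescript
// Hypothetical plain-object view of the auth store's state.
interface AuthSnapshot {
  username: string | null // null when not logged in
  password: string | null // null when not logged in
}

// isAuthenticated is derived, never stored: true only when both fields are non-null.
function deriveIsAuthenticated(state: AuthSnapshot): boolean {
  return state.username !== null && state.password !== null
}
```

The quickstart's store uses a truthiness check (`!!username.value && !!password.value`) instead, which additionally treats empty strings as unauthenticated.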
|
||||
## Backend: Application Properties
|
||||
|
||||
Two properties configure the single allowed user account:
|
||||
|
||||
| Property | Default | Source | Example |
|
||||
|----------|---------|--------|---------|
|
||||
| `app.auth.username` | `neurosurgeon` | `application.yaml` / env var `APP_AUTH_USERNAME` | `admin` |
|
||||
| `app.auth.password` | (required) | env var `APP_AUTH_PASSWORD` | `s3cret` |
|
||||
|
||||
No hashing is applied in the current `SecurityConfig` (`{noop}` prefix), so the configured password is compared as plaintext. FR-011 requires that passwords not be stored in plaintext; this plan treats the env var pattern as acceptable because the value is never persisted in the codebase. If hashing is required later, replace the `{noop}` prefix with `{bcrypt}` and supply a bcrypt hash in `APP_AUTH_PASSWORD` instead of the raw password; no other code changes are needed.
|
||||
@@ -0,0 +1,76 @@
|
||||
# Implementation Plan: Basic Login Protection
|
||||
|
||||
**Branch**: `003-basic-login` | **Date**: 2026-04-06 | **Spec**: [spec.md](./spec.md)
|
||||
**Input**: Feature specification from `/specs/003-basic-login/spec.md`
|
||||
|
||||
## Summary
|
||||
|
||||
Add a login page to the Vue frontend so users must enter a username and password before accessing any route. The backend already has Spring Security with HTTP Basic Auth fully configured; credentials are validated on every API call. The implementation introduces a Pinia auth store that holds the entered credentials in `sessionStorage`, an axios interceptor that injects them on every request, a `/login` route with a login form, router guards that redirect unauthenticated users, and a logout button in the navbar. A lightweight `/api/v1/auth/check` endpoint is added to the backend to allow the frontend to verify credentials without side effects. Username is made configurable in the backend (currently hardcoded as "neurosurgeon").
|
||||
|
||||
## Technical Context
|
||||
|
||||
**Language/Version**: Java 21 (backend) / TypeScript + Node 20 (frontend)
|
||||
**Primary Dependencies**: Spring Boot 4.0.5, Spring Security (already included), Vue 3.4, Vue Router 4.3, Pinia 2.1, Axios 1.7
|
||||
**Storage**: No new storage — credentials held in browser `sessionStorage` (frontend only)
|
||||
**Testing**: Spring Boot Test (backend), Vitest (not yet set up — out of scope for this feature)
|
||||
**Target Platform**: Web (SPA + REST API)
|
||||
**Project Type**: Web application (backend API + Vue frontend client)
|
||||
**Performance Goals**: Login response within 1 second under normal load
|
||||
**Constraints**: No new backend dependencies; no database changes; must not break existing API surface
|
||||
**Scale/Scope**: Small team (POC), single user role
|
||||
|
||||
## Constitution Check
|
||||
|
||||
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
|
||||
|
||||
| Principle | Status | Notes |
|
||||
|-----------|--------|-------|
|
||||
| I. KISS | PASS | HTTP Basic Auth is reused; no new auth protocol, no new dependencies. Frontend uses sessionStorage — no JWT, no refresh tokens. |
|
||||
| II. Easy to Change | PASS | Auth store is a single Pinia store; swapping the auth mechanism later only requires updating the store and the SecurityConfig. |
|
||||
| III. Web-First | PASS | Backend exposes REST endpoint; frontend is standalone SPA client. No server-side rendering added. |
|
||||
| IV. Documentation as Architecture | PASS | README must be updated to show the login flow in the architecture diagram (same PR). |
|
||||
| Technology Constraints | PASS | Still two deployable units (backend + frontend). No new service added. |
|
||||
|
||||
## Project Structure
|
||||
|
||||
### Documentation (this feature)
|
||||
|
||||
```text
|
||||
specs/003-basic-login/
|
||||
├── plan.md # This file
|
||||
├── research.md # Phase 0 output
|
||||
├── data-model.md # Phase 1 output
|
||||
├── quickstart.md # Phase 1 output
|
||||
├── contracts/ # Phase 1 output
|
||||
│ └── auth.md
|
||||
└── tasks.md # Phase 2 output (/speckit.tasks — NOT created by /speckit.plan)
|
||||
```
|
||||
|
||||
### Source Code (repository root)
|
||||
|
||||
```text
|
||||
backend/
|
||||
├── src/main/java/com/aiteacher/
|
||||
│ ├── config/
|
||||
│ │ └── SecurityConfig.java # MODIFY: make username configurable
|
||||
│ └── auth/
|
||||
│ └── AuthController.java # ADD: GET /api/v1/auth/check endpoint
|
||||
|
||||
frontend/
|
||||
├── src/
|
||||
│ ├── stores/
|
||||
│ │ └── authStore.ts # ADD: Pinia store for credentials + session
|
||||
│ ├── views/
|
||||
│ │ └── LoginView.vue # ADD: login form UI
|
||||
│ ├── services/
|
||||
│ │ └── api.ts # MODIFY: read credentials from authStore
|
||||
│ ├── router/
|
||||
│ │ └── index.ts # MODIFY: add /login route + navigation guard
|
||||
│ └── App.vue # MODIFY: add logout button to navbar
|
||||
```
|
||||
|
||||
**Structure Decision**: Option 2 (web application). Existing `backend/` and `frontend/` layout used; no new projects or packages.
|
||||
|
||||
## Complexity Tracking
|
||||
|
||||
> No constitution violations. Table left empty.
|
||||
@@ -0,0 +1,198 @@
|
||||
# Quickstart: Basic Login Protection
|
||||
|
||||
**Feature**: 003-basic-login
|
||||
**Date**: 2026-04-06
|
||||
|
||||
## What Changes
|
||||
|
||||
| Component | Change |
|
||||
|-----------|--------|
|
||||
| `SecurityConfig.java` | Username made configurable via `app.auth.username` property |
|
||||
| `AuthController.java` | New: `GET /api/v1/auth/check` endpoint |
|
||||
| `authStore.ts` | New: Pinia store managing credentials + sessionStorage |
|
||||
| `LoginView.vue` | New: login form page |
|
||||
| `api.ts` | Replace hardcoded Basic Auth with dynamic interceptor |
|
||||
| `router/index.ts` | Add `/login` route + `beforeEach` navigation guard |
|
||||
| `App.vue` | Add logout button to navbar |
|
||||
| `application.yaml` | Add `app.auth.username` property with default |
|
||||
|
||||
## Backend Setup
|
||||
|
||||
### 1. Add username to application.yaml
|
||||
|
||||
```yaml
|
||||
app:
|
||||
auth:
|
||||
username: ${APP_AUTH_USERNAME:neurosurgeon}
|
||||
password: ${APP_AUTH_PASSWORD} # already present
|
||||
```
|
||||
|
||||
### 2. Update SecurityConfig.java
|
||||
|
||||
Inject both username and password:
|
||||
|
||||
```java
|
||||
@Bean
|
||||
public UserDetailsService userDetailsService(
|
||||
@Value("${app.auth.username}") String username,
|
||||
@Value("${app.auth.password}") String password) {
|
||||
UserDetails user = User.builder()
|
||||
.username(username)
|
||||
.password("{noop}" + password)
|
||||
.roles("USER")
|
||||
.build();
|
||||
return new InMemoryUserDetailsManager(user);
|
||||
}
|
||||
```
|
||||
|
||||
### 3. Add AuthController.java
|
||||
|
||||
```java
|
||||
@RestController
|
||||
@RequestMapping("/api/v1/auth")
|
||||
public class AuthController {
|
||||
@GetMapping("/check")
|
||||
public ResponseEntity<Map<String, String>> check(Principal principal) {
|
||||
return ResponseEntity.ok(Map.of("username", principal.getName()));
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Frontend Setup
|
||||
|
||||
### 1. Create authStore.ts
|
||||
|
||||
```typescript
|
||||
// src/stores/authStore.ts
|
||||
import { defineStore } from 'pinia'
|
||||
import { ref, computed } from 'vue'
|
||||
|
||||
const SESSION_KEY = 'auth'
|
||||
|
||||
export const useAuthStore = defineStore('auth', () => {
|
||||
const stored = sessionStorage.getItem(SESSION_KEY)
|
||||
const parsed = stored ? JSON.parse(stored) : null
|
||||
|
||||
const username = ref<string | null>(parsed?.username ?? null)
|
||||
const password = ref<string | null>(parsed?.password ?? null)
|
||||
|
||||
const isAuthenticated = computed(() => !!username.value && !!password.value)
|
||||
|
||||
function setCredentials(u: string, p: string) {
|
||||
username.value = u
|
||||
password.value = p
|
||||
sessionStorage.setItem(SESSION_KEY, JSON.stringify({ username: u, password: p }))
|
||||
}
|
||||
|
||||
function clearCredentials() {
|
||||
username.value = null
|
||||
password.value = null
|
||||
sessionStorage.removeItem(SESSION_KEY)
|
||||
}
|
||||
|
||||
return { username, password, isAuthenticated, setCredentials, clearCredentials }
|
||||
})
|
||||
```
|
||||
|
||||
### 2. Update api.ts
|
||||
|
||||
Replace hardcoded `auth` with a request interceptor:
|
||||
|
||||
```typescript
|
||||
import axios from 'axios'
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
|
||||
export const api = axios.create({
|
||||
baseURL: import.meta.env.VITE_API_URL ?? '/api/v1',
|
||||
headers: { 'Content-Type': 'application/json' }
|
||||
})
|
||||
|
||||
api.interceptors.request.use((config) => {
|
||||
const auth = useAuthStore()
|
||||
if (auth.username && auth.password) {
|
||||
config.auth = { username: auth.username, password: auth.password }
|
||||
}
|
||||
return config
|
||||
})
|
||||
|
||||
api.interceptors.response.use(
|
||||
(response) => response,
|
||||
(error) => {
|
||||
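if (error.response?.status === 401 && window.location.pathname !== '/login') { // on /login, let LoginView show its own error instead of reloading the page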
|
||||
useAuthStore().clearCredentials()
|
||||
window.location.href = '/login'
|
||||
}
|
||||
const message = error.response?.data?.error ?? error.message ?? 'An unexpected error occurred.'
|
||||
return Promise.reject(new Error(message))
|
||||
}
|
||||
)
|
||||
```
|
||||
|
||||
### 3. Update router/index.ts
|
||||
|
||||
Add `/login` route and guard:
|
||||
|
||||
```typescript
|
||||
import LoginView from '@/views/LoginView.vue'
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
|
||||
// add to routes array:
|
||||
{ path: '/login', name: 'login', component: LoginView }
|
||||
|
||||
// add global guard:
|
||||
router.beforeEach((to) => {
|
||||
const auth = useAuthStore()
|
||||
if (to.name !== 'login' && !auth.isAuthenticated) {
|
||||
return { name: 'login' }
|
||||
}
|
||||
})
|
||||
```
|
||||
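
The guard's decision can also be isolated as a pure function, which makes it easy to reason about without a router instance. A minimal sketch under stated assumptions: `decideRedirect` is a hypothetical helper, and the `'home'` route name plus the "already authenticated on `/login`" branch are assumptions (the spec lists that edge case without deciding it).

```typescript
// Hypothetical pure helper mirroring the beforeEach guard's decision.
// Returns the route name to redirect to, or null to let navigation proceed.
function decideRedirect(targetRouteName: string | null, isAuthenticated: boolean): string | null {
  if (targetRouteName !== 'login' && !isAuthenticated) {
    return 'login' // same rule as the guard above
  }
  if (targetRouteName === 'login' && isAuthenticated) {
    return 'home' // assumption: send logged-in users away from the login form
  }
  return null
}
```

Inside `router.beforeEach`, the result would map to `return { name: redirect }` when it is non-null.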
|
||||
### 4. Create LoginView.vue
|
||||
|
||||
A simple centered form with username and password fields. On submit:
|
||||
1. Store credentials tentatively in the auth store
|
||||
2. Call `GET /api/v1/auth/check`
|
||||
3. If 200 → navigate to `/`
|
||||
4. If 401 → clear credentials, show error message
|
||||
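
The four submit steps above can be sketched as a framework-free function. Everything here is illustrative: `LoginDeps` and `submitLogin` are hypothetical names, the dependencies stand in for the auth store, the `api` service, and the router, and the error strings are placeholders.

```typescript
// Hypothetical dependencies, injected so the flow runs without Vue, Pinia, or axios.
interface LoginDeps {
  setCredentials(username: string, password: string): void
  clearCredentials(): void
  // Resolves on 200 from GET /api/v1/auth/check, rejects otherwise (e.g. 401).
  checkCredentials(): Promise<void>
  navigateHome(): void
}

// Returns an error message to display, or null when login succeeded.
async function submitLogin(
  username: string,
  password: string,
  deps: LoginDeps
): Promise<string | null> {
  if (username.trim() === '' || password === '') {
    // Assumption: empty fields are blocked client-side, per FR-008.
    return 'Enter both username and password.'
  }
  deps.setCredentials(username, password) // step 1: store tentatively
  try {
    await deps.checkCredentials() // step 2: verify against the backend
  } catch {
    deps.clearCredentials() // step 4: roll back on failure
    // Generic wording so the message does not reveal which field was wrong (FR-005).
    return 'Invalid username or password.'
  }
  deps.navigateHome() // step 3: success
  return null
}
```

Storing the credentials before the check is what lets the request interceptor attach them to the `/auth/check` call itself.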
|
||||
### 5. Add logout to App.vue navbar
|
||||
|
||||
```html
|
||||
<button class="btn btn-secondary" @click="logout">Sign out</button>
|
||||
```
|
||||
|
||||
```typescript
|
||||
import { useAuthStore } from '@/stores/authStore'
|
||||
import { useRouter } from 'vue-router'
|
||||
const auth = useAuthStore()
|
||||
const router = useRouter()
|
||||
function logout() {
|
||||
auth.clearCredentials()
|
||||
router.push({ name: 'login' })
|
||||
}
|
||||
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
### Backend (.env / docker-compose environment)
|
||||
|
||||
```
|
||||
APP_AUTH_USERNAME=neurosurgeon # optional, defaults to neurosurgeon
|
||||
APP_AUTH_PASSWORD=your-secret
|
||||
```
|
||||
|
||||
### Frontend (.env)
|
||||
|
||||
```
|
||||
VITE_API_URL=/api/v1
|
||||
# VITE_APP_PASSWORD is no longer needed and should be removed
|
||||
```
|
||||
|
||||
## Testing the Login Flow
|
||||
|
||||
1. Open the app in an incognito window — should redirect to `/login`
|
||||
2. Enter wrong credentials → error message, stay on login
|
||||
3. Enter correct credentials → redirect to `/` (Library)
|
||||
4. Refresh the page → stay logged in
|
||||
5. Click "Sign out" → redirect to `/login`; back button shows login again (no cached page access)
|
||||
@@ -0,0 +1,64 @@
|
||||
# Research: Basic Login Protection
|
||||
|
||||
**Feature**: 003-basic-login
|
||||
**Date**: 2026-04-06
|
||||
|
||||
## Finding 1: Backend Auth Mechanism — Already Implemented
|
||||
|
||||
**Decision**: Keep existing HTTP Basic Auth (Spring Security, `SecurityConfig.java`).
|
||||
**Rationale**: Spring Security with HTTP Basic is already configured and working. The backend validates credentials on every API request. There is nothing to add except making the username configurable and adding a credential-check endpoint.
|
||||
**Alternatives considered**: Form-based login with server-side sessions — rejected because it adds session management complexity on the backend that is unnecessary for an SPA using HTTP Basic.
|
||||
|
||||
---
|
||||
|
||||
## Finding 2: Frontend Credential Storage — sessionStorage
|
||||
|
||||
**Decision**: Store entered username and password in browser `sessionStorage` via a Pinia store.
|
||||
**Rationale**:
|
||||
- `sessionStorage` persists across page refreshes (same tab) but is cleared when the tab is closed — this matches the expected session behavior (SC-004) without needing a server-side session or JWT.
|
||||
- Simpler than `localStorage`: the browser clears `sessionStorage` when the tab is closed, so no explicit logout is needed for cleanup.
|
||||
- No additional dependencies required.
|
||||
|
||||
**Alternatives considered**:
|
||||
- `localStorage` — rejected: credentials would persist indefinitely across browser sessions, which is unexpected for a "login" flow.
|
||||
- In-memory (reactive ref only) — rejected: credentials lost on page refresh, violating SC-004.
|
||||
- Cookie-based session (server-side) — rejected: requires CSRF protection, session store, and more backend complexity; violates KISS.
|
||||
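
A sketch of the save/restore round trip this decision implies. It is written against a minimal storage interface so it also runs outside a browser; `KeyValueStore`, `StoredCredentials`, `saveCredentials`, `loadCredentials`, and `createMemoryStore` are hypothetical names, and discarding unreadable entries is an assumption rather than documented behaviour.

```typescript
// Minimal subset of the Web Storage API; sessionStorage satisfies it in a browser.
interface KeyValueStore {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
  removeItem(key: string): void
}

interface StoredCredentials {
  username: string
  password: string
}

const SESSION_KEY = 'auth'

function saveCredentials(store: KeyValueStore, creds: StoredCredentials): void {
  store.setItem(SESSION_KEY, JSON.stringify(creds))
}

function loadCredentials(store: KeyValueStore): StoredCredentials | null {
  const raw = store.getItem(SESSION_KEY)
  if (raw === null) return null
  try {
    const parsed: unknown = JSON.parse(raw)
    if (typeof parsed === 'object' && parsed !== null) {
      const candidate = parsed as { username?: unknown; password?: unknown }
      if (typeof candidate.username === 'string' && typeof candidate.password === 'string') {
        return { username: candidate.username, password: candidate.password }
      }
    }
  } catch {
    // Malformed JSON: fall through and treat as "not logged in".
  }
  // Assumption: drop unreadable entries so they are not re-parsed on every load.
  store.removeItem(SESSION_KEY)
  return null
}

// In-memory stand-in for sessionStorage (for illustration and tests only).
function createMemoryStore(): KeyValueStore {
  const entries: Record<string, string> = {}
  const has = (key: string) => Object.prototype.hasOwnProperty.call(entries, key)
  return {
    getItem: (key) => (has(key) ? entries[key] : null),
    setItem: (key, value) => { entries[key] = value },
    removeItem: (key) => { delete entries[key] },
  }
}
```

In the browser, `loadCredentials(sessionStorage)` would be the call made at store initialisation; the tolerant parse keeps a corrupted entry from throwing during app start.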
|
||||
---
|
||||
|
||||
## Finding 3: Credential Verification — Lightweight Backend Endpoint
|
||||
|
||||
**Decision**: Add `GET /api/v1/auth/check` that returns `200 OK` with `{"username": "..."}` for authenticated requests.
|
||||
**Rationale**: The frontend needs a way to verify that stored credentials are still valid when the app loads (e.g., after a refresh). Without it, stale credentials (for example, after a password change) would only be detected when the first real API call fails with a 401, by which point the main UI has already started to render. This endpoint is protected by Spring Security like all others, so no special logic is needed.
|
||||
**Alternatives considered**:
|
||||
- Re-use any existing GET endpoint (e.g., `GET /api/v1/books`) — rejected: couples auth verification to a business endpoint; semantically wrong and fragile.
|
||||
- Intercept 401s globally and redirect to login — used as a fallback but not sufficient alone: the user would see a flash of the main UI before being redirected.
|
||||
|
||||
---
|
||||
|
||||
## Finding 4: Axios Integration — Request Interceptor
|
||||
|
||||
**Decision**: Replace the hardcoded `auth` field in `api.ts` with a dynamic request interceptor that reads credentials from the Pinia auth store at request time.
|
||||
**Rationale**: The current `api.ts` sets `auth: { username, password }` once at module initialisation from env vars. This must change so the login form's entered credentials are used. A request interceptor reads the store on every call, enabling logout (clear store → next request gets no credentials → 401 → redirect to login).
|
||||
**Alternatives considered**:
|
||||
- Recreate the axios instance after login — rejected: all existing services import the singleton `api`; recreating would require updating every import.
|
||||
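
The difference between fixing credentials at module initialisation and reading them at request time can be shown with a miniature client. This is an illustrative model, not axios: `RequestConfig`, `RequestInterceptor`, `credentialHolder`, and `prepareRequest` are hypothetical names, and the holder object stands in for the Pinia auth store.

```typescript
// Hypothetical miniature of an HTTP client with request interceptors.
interface BasicAuth { username: string; password: string }
interface RequestConfig { url: string; auth?: BasicAuth }
type RequestInterceptor = (config: RequestConfig) => RequestConfig

// Mutable holder standing in for the auth store.
const credentialHolder: { current: BasicAuth | null } = { current: null }

// Reads the holder on every request, so login and logout take effect immediately.
const injectCredentials: RequestInterceptor = (config) => {
  const creds = credentialHolder.current
  return creds ? { ...config, auth: { ...creds } } : config
}

// Runs each interceptor in order over a fresh config for the given URL.
function prepareRequest(url: string, interceptors: RequestInterceptor[]): RequestConfig {
  return interceptors.reduce<RequestConfig>((config, run) => run(config), { url })
}
```

Because the interceptor looks up the holder each time it runs, the singleton client never needs to be recreated when the user logs in or out.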
|
||||
---
|
||||
|
||||
## Finding 5: Backend Username Configurability
|
||||
|
||||
**Decision**: Read username from `${app.auth.username:neurosurgeon}` in `SecurityConfig.java` (with "neurosurgeon" as default).
|
||||
**Rationale**: The spec (FR-012) requires credentials to be configurable. Currently the password is configurable via env var but the username is hardcoded. Adding a `@Value`-injected username field is a one-line change.
|
||||
**Alternatives considered**: None — this is the Spring Boot idiomatic approach already used for the password.
|
||||
|
||||
---
|
||||
|
||||
## Summary of Unknowns Resolved
|
||||
|
||||
| Unknown | Resolution |
|
||||
|---------|-----------|
|
||||
| Where to store credentials on the frontend | `sessionStorage` via Pinia |
|
||||
| How to verify credentials after page refresh | `GET /api/v1/auth/check` endpoint |
|
||||
| How to inject credentials into axios | Request interceptor in `api.ts` |
|
||||
| How to handle 401s globally | Response interceptor → redirect to `/login` |
|
||||
| Backend username configurability | `@Value("${app.auth.username:neurosurgeon}")` |
|
||||
@@ -0,0 +1,103 @@
|
||||
# Feature Specification: Basic Login Protection
|
||||
|
||||
**Feature Branch**: `003-basic-login`
|
||||
**Created**: 2026-04-06
|
||||
**Status**: Draft
|
||||
**Input**: User description: "Add simple and basic login (username and password) to protect the app."
|
||||
|
||||
## User Scenarios & Testing *(mandatory)*
|
||||
|
||||
### User Story 1 - Authenticate to Access the App (Priority: P1)
|
||||
|
||||
A user opens the application and is presented with a login screen. They enter their username and password and, upon successful authentication, gain access to the full application. Without logging in, no part of the application is accessible.
|
||||
|
||||
**Why this priority**: This is the core feature — all other functionality depends on this gate being in place.
|
||||
|
||||
**Independent Test**: Can be fully tested by navigating to any page without credentials (should redirect to login), then logging in with valid credentials (should grant access) — this alone delivers the full MVP value.
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** an unauthenticated user, **When** they navigate to any page of the app, **Then** they are redirected to the login screen
|
||||
2. **Given** the login screen, **When** the user enters valid credentials and submits, **Then** they are redirected to the application home/dashboard
|
||||
3. **Given** the login screen, **When** the user enters invalid credentials and submits, **Then** an error message is displayed and they remain on the login screen
|
||||
4. **Given** an authenticated user, **When** they navigate directly to a protected page, **Then** they can access it without re-authenticating
|
||||
|
||||
---
|
||||
|
||||
### User Story 2 - Log Out of the App (Priority: P2)
|
||||
|
||||
An authenticated user can explicitly log out of the application, terminating their session. After logging out, they are redirected to the login screen and must re-authenticate to access the app.
|
||||
|
||||
**Why this priority**: Logout is essential for security — especially on shared machines — but the app is still protected even without explicit logout (session expires).
|
||||
|
||||
**Independent Test**: Can be fully tested by logging in, clicking logout, and confirming that the protected pages are no longer accessible.
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** an authenticated user, **When** they click the logout button, **Then** their session is terminated and they are redirected to the login screen
|
||||
2. **Given** a user who has logged out, **When** they navigate to a protected page, **Then** they are redirected to the login screen
|
||||
|
||||
---
|
||||
|
||||
### User Story 3 - Session Persistence Across Browser Refresh (Priority: P3)
|
||||
|
||||
An authenticated user refreshes the page or reopens the browser tab and remains logged in without having to re-enter credentials, as long as their session has not expired.
|
||||
|
||||
**Why this priority**: Improves usability — users should not be forced to log in after every page refresh during normal use.
|
||||
|
||||
**Independent Test**: Can be tested by logging in, refreshing the page, and confirming the user is still authenticated.
|
||||
|
||||
**Acceptance Scenarios**:
|
||||
|
||||
1. **Given** an authenticated user, **When** they refresh the browser, **Then** they remain logged in and on the same page
|
||||
2. **Given** a session that has expired, **When** the user tries to access a protected page, **Then** they are redirected to the login screen
|
||||
|
||||
---
|
||||
|
||||
### Edge Cases
|
||||
|
||||
- What happens when the login form is submitted with empty username or password fields?
|
||||
- How does the system handle a user whose credentials are removed/disabled while they have an active session?
|
||||
- What happens if the user attempts to access the login page while already authenticated?
|
||||
- How does the system behave if the session store becomes unavailable?
|
||||
|
||||
## Requirements *(mandatory)*
|
||||
|
||||
### Functional Requirements
|
||||
|
||||
- **FR-001**: System MUST display a login screen (username + password fields and submit button) to unauthenticated users
|
||||
- **FR-002**: System MUST redirect unauthenticated users attempting to access any protected page to the login screen
|
||||
- **FR-003**: System MUST validate submitted credentials against configured or stored credentials
|
||||
- **FR-004**: System MUST create an authenticated session upon successful login
|
||||
- **FR-005**: System MUST display a clear, user-friendly error message when credentials are incorrect (without revealing which field is wrong)
|
||||
- **FR-006**: System MUST provide a logout action that terminates the active session
|
||||
- **FR-007**: System MUST redirect users to the login screen after logout
|
||||
- **FR-008**: System MUST prevent login form submission when username or password is empty
|
||||
- **FR-009**: System MUST automatically expire sessions after a reasonable inactivity period
|
||||
- **FR-010**: System MUST redirect users to the login page if their session has expired
|
||||
- **FR-011**: Credentials MUST be stored securely (passwords hashed, not stored in plaintext)
|
||||
- **FR-012**: System MUST allow at least one user account to be configured via environment variables or a configuration file; credential changes take effect on restart
|
||||
|
||||
### Key Entities
|
||||
|
||||
- **User Account**: Represents a person who can authenticate; has a username (unique identifier) and a hashed password
|
||||
- **Session**: Represents an active authenticated context; linked to a user account, has an expiry time
|
||||
|
||||
## Success Criteria *(mandatory)*
|
||||
|
||||
### Measurable Outcomes
|
||||
|
||||
- **SC-001**: Unauthenticated users cannot access any protected page — 100% of protected routes redirect to login
|
||||
- **SC-002**: Users can complete the login flow (enter credentials, submit, land on the app) in under 30 seconds under normal conditions
|
||||
- **SC-003**: Invalid login attempts display an error within 3 seconds and do not reveal which field was wrong
|
||||
- **SC-004**: Authenticated sessions persist across page refreshes for the configured session duration without requiring re-authentication
|
||||
- **SC-005**: Logout terminates the session immediately — any subsequent request to a protected page results in a redirect to login
|
||||
|
||||
## Assumptions
|
||||
|
||||
- The target users are a small, known group — there is no public self-registration for new accounts
|
||||
- A single set of credentials (or a small number of pre-configured accounts) is sufficient for this initial version
|
||||
- Mobile/responsive design for the login form is expected but full mobile optimization is not the focus of this feature
|
||||
- The app currently has no authentication layer, so this will be added globally
|
||||
- Session duration defaults to a reasonable inactivity timeout (e.g., 30 minutes of inactivity), configurable if needed
|
||||
- A "remember me" / persistent login cookie is out of scope for this initial implementation
|
||||
@@ -0,0 +1,172 @@
|
||||
# Tasks: Basic Login Protection
|
||||
|
||||
**Input**: Design documents from `/specs/003-basic-login/`
|
||||
**Prerequisites**: plan.md ✅ spec.md ✅ research.md ✅ data-model.md ✅ contracts/ ✅ quickstart.md ✅
|
||||
|
||||
**Tests**: Not requested — no test tasks included.
|
||||
|
||||
**Organization**: Tasks grouped by user story to enable independent implementation and testing.
|
||||
|
||||
## Format: `[ID] [P?] [Story] Description`
|
||||
|
||||
- **[P]**: Can run in parallel (different files, no dependencies on incomplete tasks)
|
||||
- **[Story]**: Which user story this task belongs to (US1, US2, US3)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Setup (Shared Infrastructure)
|
||||
|
||||
**Purpose**: No new project initialization needed — existing backend and frontend projects are in place. Phase 1 confirms the entry points for changes.
|
||||
|
||||
- [x] T001 Verify `spring-boot-starter-security` is present in `backend/pom.xml` (already included — confirm, no change needed)
|
||||
- [x] T002 Verify Pinia is listed in `frontend/package.json` dependencies (already included — confirm, no change needed)
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Foundational (Blocking Prerequisites)
|
||||
|
||||
**Purpose**: Backend credential endpoint and frontend auth store — required by all three user stories.
|
||||
|
||||
**⚠️ CRITICAL**: No user story work can begin until this phase is complete.
|
||||
|
||||
- [x] T003 Add `app.auth.username` property to `backend/src/main/resources/application.yaml` with value `${APP_AUTH_USERNAME:neurosurgeon}` alongside the existing `app.auth.password` entry
|
||||
- [x] T004 Update `backend/src/main/java/com/aiteacher/config/SecurityConfig.java` to inject `@Value("${app.auth.username}")` and pass it to `User.builder().username(username)` instead of the hardcoded string `"neurosurgeon"`
|
||||
- [x] T005 Create `backend/src/main/java/com/aiteacher/auth/AuthController.java` — `@RestController` at `/api/v1/auth`, with a single `GET /check` method that accepts a `Principal` argument and returns `ResponseEntity.ok(Map.of("username", principal.getName()))`
|
||||
- [x] T006 Create `frontend/src/stores/authStore.ts` — Pinia store with `username` and `password` refs (initially `null`), `isAuthenticated` computed, `setCredentials(u, p)` action, and `clearCredentials()` action (sessionStorage persistence added in Phase 5 / US3)
|
||||
|
||||
**Checkpoint**: Backend exposes `GET /api/v1/auth/check`; `authStore` is callable from any Vue component.
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: User Story 1 - Authenticate to Access the App (Priority: P1) 🎯 MVP
|
||||
|
||||
**Goal**: Unauthenticated users are redirected to `/login`; successful credential entry grants full app access.
|
||||
|
||||
**Independent Test**: Open the app in incognito → redirected to `/login`. Enter wrong credentials → error shown. Enter correct credentials → land on Library (`/`). Refresh the page → redirected to `/login` again (credentials are held in memory only at this stage; refresh persistence comes in US3).
|
||||
|
||||
### Implementation for User Story 1
|
||||
|
||||
- [x] T007 [US1] Create `frontend/src/views/LoginView.vue` — centered card with username input, password input, submit button, and an error message area; on submit, call `authStore.setCredentials(u, p)`, then call `GET /api/v1/auth/check` via the `api` service; on 200 navigate to `/`; on failure call `authStore.clearCredentials()` and display the error
|
||||
- [x] T008 [US1] Update `frontend/src/services/api.ts` — remove the hardcoded `auth: { username, password }` field from the axios instance; add a **request interceptor** that reads `authStore.username` and `authStore.password` and sets `config.auth` dynamically; add a **response interceptor** that on `401` calls `authStore.clearCredentials()` and redirects to `/login` (replace the existing error-normalisation interceptor rather than adding a second one — keep error normalisation intact)
|
||||
- [x] T009 [US1] Update `frontend/src/router/index.ts` — add a `{ path: '/login', name: 'login', component: LoginView }` route; add a `router.beforeEach` guard that redirects to `{ name: 'login' }` when `to.name !== 'login'` and `!authStore.isAuthenticated`
|
||||
|
||||
**Checkpoint**: US1 fully functional — incognito flow, failed login, and successful login all work independently.
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: User Story 2 - Log Out of the App (Priority: P2)
|
||||
|
||||
**Goal**: Authenticated users can log out, terminating their session and returning to `/login`.
|
||||
|
||||
**Independent Test**: Log in → click "Sign out" in the navbar → redirected to `/login`; navigating back to any protected route redirects to `/login` again.
|
||||
|
||||
### Implementation for User Story 2
|
||||
|
||||
- [x] T010 [US2] Update `frontend/src/App.vue` — import `useAuthStore` and `useRouter`; add a "Sign out" button to the navbar (visible only when `authStore.isAuthenticated`); clicking it calls `authStore.clearCredentials()` then `router.push({ name: 'login' })`; hide the navbar links (`RouterLink` items) when on the login page by checking `$route.name !== 'login'`
|
||||
|
||||
**Checkpoint**: US2 fully functional — logout clears session and blocks re-entry without credentials.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: User Story 3 - Session Persistence Across Browser Refresh (Priority: P3)
|
||||
|
||||
**Goal**: Authenticated users survive a page refresh without re-logging in; expired/invalid stored credentials redirect to `/login`.
|
||||
|
||||
**Independent Test**: Log in → refresh the browser → remain on the same page without re-entering credentials. Manually clear `sessionStorage` and refresh → redirected to `/login`.
|
||||
|
||||
### Implementation for User Story 3
|
||||
|
||||
- [x] T011 [US3] Update `frontend/src/stores/authStore.ts` — in `setCredentials`, write `{ username, password }` to `sessionStorage` under a key (e.g. `'auth'`); in `clearCredentials`, call `sessionStorage.removeItem('auth')`; on store initialization (module load), read from `sessionStorage` and pre-populate `username` and `password` refs if present
|
||||
- [x] T012 [US3] Update `frontend/src/main.ts` — after creating the app and mounting Pinia, call `authStore.restoreSession()` (or inline the check): if `authStore.isAuthenticated`, call `GET /api/v1/auth/check`; if the response is `401`, call `authStore.clearCredentials()` so stale stored credentials are evicted before the router guard fires
|
||||
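
The startup check described in T012 might look roughly like this. It is a sketch under assumptions: `SessionDeps` and this `restoreSession` signature are hypothetical, `checkStatus` is an imagined wrapper that resolves with the HTTP status instead of rejecting, and leaving credentials untouched on non-401 statuses is a design assumption.

```typescript
// Hypothetical startup hook; names are illustrative, not taken from the repository.
interface SessionDeps {
  isAuthenticated(): boolean
  // Resolves with the HTTP status of GET /api/v1/auth/check.
  checkStatus(): Promise<number>
  clearCredentials(): void
}

async function restoreSession(deps: SessionDeps): Promise<void> {
  if (!deps.isAuthenticated()) return // nothing stored: the guard will redirect to /login
  const httpStatus = await deps.checkStatus()
  if (httpStatus === 401) {
    deps.clearCredentials() // stale credentials: evict before the first navigation
  }
  // Assumption: other statuses (e.g. 5xx) leave stored credentials untouched.
}
```

In Vue Router 4 the initial navigation starts when the router is installed on the app, so one option is to await `restoreSession` in `main.ts` before calling `app.use(router)`; otherwise the guard may run against credentials that are about to be cleared.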
|
||||
**Checkpoint**: US3 fully functional — refresh persists login; stale or invalidated credentials are detected on load.
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Polish & Cross-Cutting Concerns
|
||||
|
||||
**Purpose**: Documentation and cleanup that spans all user stories.
|
||||
|
||||
- [x] T013 [P] Update `frontend/.env.example` — remove `VITE_APP_PASSWORD` (the frontend no longer reads a password from env; add a comment explaining credentials are now entered via the login form)
|
||||
- [x] T014 [P] Update `README.md` — add or update the Mermaid architecture diagram to show the login flow: browser → login form → `/api/v1/auth/check` → app; this satisfies Constitution Principle IV (diagram must be updated in the same PR as any architectural change)
|
||||
|
||||
**Checkpoint**: Feature complete — all three user stories functional, documentation current, obsolete env var removed.
|
||||
|
||||
---
|
||||
|
||||
## Dependencies & Execution Order
|
||||
|
||||
### Phase Dependencies
|
||||
|
||||
- **Setup (Phase 1)**: No dependencies — confirm immediately
|
||||
- **Foundational (Phase 2)**: Depends on Phase 1 — **BLOCKS all user stories**
|
||||
- **US1 (Phase 3)**: Depends on Phase 2 completion
|
||||
- **US2 (Phase 4)**: Depends on Phase 2 completion; integrates with authStore from Phase 2 and router from US1
|
||||
- **US3 (Phase 5)**: Depends on Phase 2 completion; extends authStore from Phase 2
|
||||
- **Polish (Phase 6)**: Depends on all desired user stories being complete
|
||||
|
||||
### User Story Dependencies
|
||||
|
||||
- **US1 (P1)**: Depends only on Foundational — the primary blocker for all UI work
|
||||
- **US2 (P2)**: Depends on Foundational and US1 (logout button lives in App.vue which needs the router guard from US1)
|
||||
- **US3 (P3)**: Depends on Foundational only — authStore persistence is independent of the login form; can be developed in parallel with US1 if desired
|
||||
|
||||
### Within Each User Story
|
||||
|
||||
- Foundational tasks (T003–T006) must all complete before US1 starts
|
||||
- T007 (LoginView) and T008 (api.ts) can be developed in parallel within US1; T009 (router guard) depends on T007 existing
|
||||
- US2 is a single task (T010), so it has no internal ordering constraints
|
||||
- T012 (main.ts restore check) depends on T011 (authStore persistence) within US3
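The ordering rules above can be checked mechanically by grouping tasks into waves, where every task in a wave has all of its dependencies in earlier waves. The sketch below is illustrative only: `dependsOn` and `planWaves` are hypothetical names, and T010 is modeled as depending on T008 and T009 on the assumption that "depends on US1" means all of US1 must be finished.

```typescript
// Foundational tasks that gate every user story.
const foundational: string[] = ['T003', 'T004', 'T005', 'T006'];

// Task -> direct prerequisites, as read from the dependency notes.
const dependsOn: Record<string, string[]> = {
  T003: [],
  T004: [],
  T005: [],
  T006: [],
  T007: foundational,
  T008: foundational,
  T009: ['T007'],
  T010: ['T008', 'T009'], // assumption: US2 waits for all of US1
  T011: foundational,
  T012: ['T011'],
};

// Groups tasks into waves; tasks inside one wave can run in parallel.
function planWaves(graph: Record<string, string[]>): string[][] {
  const done: Record<string, boolean> = {};
  const waves: string[][] = [];
  let remaining = Object.keys(graph);

  while (remaining.length > 0) {
    // A task is ready once every prerequisite is already done.
    const ready = remaining.filter((task) =>
      (graph[task] ?? []).every((dep) => done[dep] === true)
    );
    if (ready.length === 0) {
      throw new Error('Cycle or unknown dependency among: ' + remaining.join(', '));
    }
    ready.forEach((task) => { done[task] = true; });
    remaining = remaining.filter((task) => done[task] !== true);
    waves.push(ready);
  }
  return waves;
}
```

With the graph above, `planWaves(dependsOn)` yields four waves: the foundational tasks, then T007/T008/T011, then T009/T012, and finally T010. This matches the note that US3 can proceed in parallel with US1.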
|
||||
|
||||
---
|
||||
|
||||
## Parallel Example: Foundational Phase
|
||||
|
||||
```
Parallelizable within Phase 2:
T003: application.yaml update
T004: SecurityConfig.java update
T005: AuthController.java (new file)
T006: authStore.ts (new file)
All four touch different files with no shared dependencies.
```
|
||||
|
||||
## Parallel Example: User Story 1
|
||||
|
||||
```
Parallelizable within Phase 3:
T007: LoginView.vue (new file)
T008: api.ts update (different file)
Then sequentially:
T009: router/index.ts (depends on LoginView existing)
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Strategy
|
||||
|
||||
### MVP First (User Story 1 Only)
|
||||
|
||||
1. Complete Phase 1: Confirm existing deps
|
||||
2. Complete Phase 2: Foundational — backend endpoint + auth store
|
||||
3. Complete Phase 3: US1 — login page, axios interceptors, router guard
|
||||
4. **STOP and VALIDATE**: Open incognito, verify redirect → login → success flow
|
||||
5. Demo / merge if MVP is sufficient
|
||||
|
||||
### Incremental Delivery
|
||||
|
||||
1. Phase 1 + Phase 2 → Foundation ready
|
||||
2. Phase 3 (US1) → Login gate works — **MVP** ✅
|
||||
3. Phase 4 (US2) → Logout works ✅
|
||||
4. Phase 5 (US3) → Session survives refresh ✅
|
||||
5. Phase 6 → Documentation and cleanup ✅
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- [P] tasks touch different files and have no dependency on an incomplete sibling task in the same phase
|
||||
- No tests included (not requested in the spec)
|
||||
- T013 removes `VITE_APP_PASSWORD` from `.env.example`; do **not** remove it from any local `.env` file until the login form is working (T007–T009 complete)
|
||||
- The 401 response interceptor in T008 handles the edge case where stored credentials become invalid server-side — no additional handling needed
|
||||
- Constitution IV requires the README Mermaid diagram to be updated in the **same PR** — T014 must not be skipped
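The note on the 401 interceptor can be illustrated with a small handler. This is a sketch under assumptions: the real interceptor in `api.ts` is registered on an axios instance, which is left out here so the logic stays self-contained, and the redirect to `/login` is assumed from the US3 goal. `AuthActions`, `RouteNavigator`, and `handleUnauthorized` are hypothetical names.

```typescript
// Hypothetical minimal surfaces of the auth store and the router.
interface AuthActions {
  clearCredentials(): void;
}

interface RouteNavigator {
  push(path: string): void;
}

// Intended to be called from a response-error interceptor with the HTTP
// status of the failed request (undefined when there was no response).
// Returns true when the error was a 401 and the session was reset.
function handleUnauthorized(
  statusCode: number | undefined,
  auth: AuthActions,
  router: RouteNavigator
): boolean {
  if (statusCode !== 401) return false; // other errors are left to the caller
  auth.clearCredentials();              // evict credentials the server rejected
  router.push('/login');                // assumed redirect target
  return true;
}
```

Keeping the decision in a plain function like this makes it easy to exercise without an HTTP client; the interceptor itself would only extract the status code and delegate.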
|
||||
@@ -0,0 +1,34 @@
|
||||
# Specification Quality Checklist: RAG Retrieval Quality Improvements
|
||||
|
||||
**Purpose**: Validate specification completeness and quality before proceeding to planning
|
||||
**Created**: 2026-04-06
|
||||
**Feature**: [spec.md](../spec.md)
|
||||
|
||||
## Content Quality
|
||||
|
||||
- [x] No implementation details (languages, frameworks, APIs)
|
||||
- [x] Focused on user value and business needs
|
||||
- [x] Written for non-technical stakeholders
|
||||
- [x] All mandatory sections completed
|
||||
|
||||
## Requirement Completeness
|
||||
|
||||
- [x] No [NEEDS CLARIFICATION] markers remain
|
||||
- [x] Requirements are testable and unambiguous
|
||||
- [x] Success criteria are measurable
|
||||
- [x] Success criteria are technology-agnostic (no implementation details)
|
||||
- [x] All acceptance scenarios are defined
|
||||
- [x] Edge cases are identified
|
||||
- [x] Scope is clearly bounded
|
||||
- [x] Dependencies and assumptions identified
|
||||
|
||||
## Feature Readiness
|
||||
|
||||
- [x] All functional requirements have clear acceptance criteria
|
||||
- [x] User scenarios cover primary flows
|
||||
- [x] Feature meets measurable outcomes defined in Success Criteria
|
||||
- [x] No implementation details leak into specification
|
||||
|
||||
## Notes
|
||||
|
||||
- All items pass. Ready to proceed to `/speckit.clarify` or `/speckit.plan`.
|
||||
Some files were not shown because too many files have changed in this diff.