10 KiB
Feature Specification: Neurosurgeon RAG Learning Platform
Feature Branch: 001-neuro-rag-learning
Created: 2026-03-31
Status: Draft
Input: User description: "I want to build a web application to help learn complex topics for neurosurgeons. I will provide to the system a predefined list of topics. The system let user upload books. The books should be embedded to be used latter for LLM (RAG). Embedding is crucial, it MUST be precise, with diagram. The user can select a topics, a LLM will provide a summary by crossing information from the uploaded books. User can have a chat to deepen their knowledge"
User Scenarios & Testing (mandatory)
User Story 1 - Book Upload & Precise Embedding (Priority: P1)
A neurosurgeon uploads a medical textbook (PDF) to the platform. The system processes the book, extracting both textual content and embedded diagrams/figures with high fidelity. Once processing is complete, the book is available as a knowledge source for topic summaries and chat. The user can see the upload status and a confirmation that the book is ready.
Why this priority: Without embedded books there is no knowledge base. All other stories depend on this story being complete and functional.
Independent Test: Upload a single PDF textbook. Verify the upload is accepted, processing completes, and the book appears in the library as "Ready". Then ask a topic question whose answer appears only in that book and confirm the correct answer surfaces.
Acceptance Scenarios:
- Given a user on the upload page, When they select a valid PDF file and confirm upload, Then the system accepts the file, shows a processing progress indicator, and eventually marks the book as "Ready" in the library.
- Given a book that contains anatomical diagrams, When the system finishes embedding, Then diagram content (labels, captions, spatial relationships described in the diagram) is searchable and retrievable alongside the surrounding text.
- Given an upload of an unsupported file format (e.g., DOCX), When the user submits, Then the system rejects the file with a clear error message explaining accepted formats.
- Given a book is currently being processed, When the user navigates to the library, Then the book appears with a "Processing" status and cannot yet be used as a knowledge source.
User Story 2 - Topic-Guided Summary (Priority: P2)
The user browses a predefined list of neurosurgery topics, selects one (e.g., "Cerebral Aneurysm Management"), and receives an AI-generated summary that cross-references all uploaded books. The summary synthesizes information from multiple sources, citing which book each piece of information comes from.
Why this priority: This is the core learning feature — the primary reason a neurosurgeon uses the platform. It delivers immediate value once at least one book is embedded.
Independent Test: With at least one book covering the target topic embedded, select that topic from the list and confirm: (a) a coherent summary is generated, (b) the summary references content present in the uploaded book(s), and (c) source citations are visible.
Acceptance Scenarios:
- Given at least one book is in "Ready" state, When the user selects a topic from the predefined list, Then the system generates a summary within 30 seconds that draws on content from the uploaded books.
- Given multiple books are uploaded, When the user requests a topic summary, Then the summary synthesizes information from all relevant books and indicates which source each key point came from.
- Given no uploaded book contains content relevant to the selected topic, When the user requests a summary, Then the system clearly communicates that its knowledge is limited and no relevant source was found.
- Given a topic summary is displayed, When the user inspects a cited passage, Then they can identify the originating book title and approximate location (chapter or page range).
User Story 3 - Knowledge Deepening Chat (Priority: P3)
After reading a topic summary (or independently), the user enters a conversational chat to ask follow-up questions. The AI answers using the embedded books as its exclusive knowledge source, enabling the user to drill into specific areas, request clarifications, or explore edge cases.
Why this priority: The chat extends the value of the summary by enabling personalised, interactive learning. It builds on the RAG infrastructure established by P1 and P2.
Independent Test: Start a chat session on a specific topic. Ask a specific clinical question whose answer is in an uploaded book. Confirm the response references the correct source and that the conversation maintains context across at least 3 turns.
Acceptance Scenarios:
- Given a user in a chat session, When they ask a question relevant to an uploaded book, Then the system responds with a grounded answer and cites the source book.
- Given an ongoing chat, When the user asks a follow-up question that refers to a previous turn ("What about the complication you just mentioned?"), Then the system maintains conversational context and provides a coherent answer.
- Given a user asks a question outside the scope of any uploaded book, When the system responds, Then it clearly states that no relevant source was found rather than generating unsupported claims.
- Given a chat session is ongoing, When the user wishes to start fresh, Then they can clear the conversation history and begin a new session.
Edge Cases
- What happens when a PDF is corrupted or password-protected?
- How does the system handle books that are very large (500+ pages)?
- What if two uploaded books contain contradictory information on the same topic?
- How does diagram embedding behave if a diagram has no caption or label?
Requirements (mandatory)
Functional Requirements
- FR-001: System MUST allow users to upload books in PDF format.
- FR-002: System MUST extract and embed textual content from uploaded books with high precision for use in semantic search.
- FR-003: System MUST extract and embed visual content (diagrams, figures, and their captions/labels) from uploaded books so diagram information is retrievable by the RAG system.
- FR-004: System MUST display a predefined, curated list of neurosurgery topics for user selection.
- FR-005: System MUST generate a topic summary by cross-referencing all "Ready" uploaded books and synthesizing relevant passages.
- FR-006: System MUST cite the source book (and approximate location) for each key claim in a generated summary.
- FR-007: System MUST provide a conversational chat interface where AI answers are grounded exclusively in uploaded book content.
- FR-008: System MUST maintain conversational context within a chat session across multiple turns.
- FR-009: System MUST display the embedding/processing status of each uploaded book (Pending, Processing, Ready, Failed).
- FR-010: System MUST reject uploaded files that are not in a supported format and provide a clear error message.
- FR-011: System MUST communicate clearly when a query cannot be answered from the available book content, rather than generating unsupported claims.
- FR-012: The book library MUST be a single shared global library; all users see and benefit from the same uploaded books. Per-user isolation is out of scope for the POC.
Key Entities
- Book: Uploaded document. Attributes: title, file name, upload date, processing status (Pending / Processing / Ready / Failed), page count.
- Topic: Predefined learning subject. Attributes: name, description, category. Managed via configuration (not user-editable in the POC).
- Embedding Chunk: A semantically coherent unit of book content (text passage or diagram + caption) with its vector representation and source reference.
- Chat Session: A conversation thread. Attributes: creation date, associated topic (optional), message history.
- Message: A single turn in a chat session. Attributes: role (user / assistant), content, source citations (for assistant messages).
Success Criteria (mandatory)
Measurable Outcomes
- SC-001: A user can upload a book and have it fully processed and searchable within 10 minutes for a standard-length medical textbook (up to 500 pages).
- SC-002: Topic summaries are generated within 30 seconds of user request.
- SC-003: At least 90% of generated summary claims can be traced back to a cited passage in an uploaded book.
- SC-004: A user completing the primary flow (upload → topic summary → chat) requires no external instructions — the interface is self-explanatory.
- SC-005: Diagram content from uploaded books is retrievable by the RAG system; at least one diagram-sourced fact surfaces correctly in a controlled test query.
- SC-006: The system correctly declines to answer (and explains why) when a question has no grounding in uploaded books, in at least 9 out of 10 out-of-scope test queries.
Assumptions
- Books are uploaded as PDF files; other formats (EPUB, DOCX) are out of scope for the POC.
- The predefined topic list is small (10–50 topics) and curated manually by the project owner via a configuration file; no admin UI is needed for the POC.
- Access is protected by a simple shared password or API token (no individual user accounts); anyone who knows the credential can access the application. Full account management is out of scope for the POC.
- The LLM used for summary generation and chat is accessed via an external API (not self-hosted); the specific provider is a technical implementation decision.
- Diagram embedding means extracting diagram images and their associated captions/labels as descriptive text for semantic search; pixel-level image similarity search is out of scope for the POC.
- The system is designed for a small number of concurrent users (POC scale: < 10 simultaneous users); horizontal scaling is not a requirement at this stage.
- Internet connectivity is assumed for both the user and the server (external LLM API calls).