[FEAT]: Add Right Pane for Cited Material Display in Chat UI (RAG Support) #2426

Open
opened 2026-02-28 06:07:29 -05:00 by deekerman · 5 comments
Owner

Originally created by @ROlwig on GitHub (Apr 29, 2025).

What would you like to see?

Is your feature request related to a problem? Please describe.
Somewhat. In retrieval-augmented generation (RAG) use cases, cited documents are currently embedded inline in messages, which disrupts reading and makes it hard to reference sources, especially for longer documents such as PDF textbooks.


Describe the solution you'd like
Add a right-side pane to the UI that shows retrieved or cited material relevant to the active chat message. Ideal functionality includes:
• Collapsible right pane that previews cited excerpts or document snippets.
• Displays citation metadata (title, page, etc.).
• Clicking a citation in chat scrolls to or highlights that content in the pane, including the specific page from the metadata.
• Option to pin or toggle the pane as needed.


Describe alternatives you've considered
• The current inline citations are fine for short text but hard to scan or review in longer responses.
• Pop-up modals or external links: break the conversational flow.


Additional context
Very few open-source UIs offer this feature. Adding it would improve usability for education, research, and compliance settings—where seeing the source matters.
Example use case: In an AI tutor scenario, a student asks a question. The AI replies with an answer based on an uploaded textbook. The right pane shows the relevant excerpt, helping the student verify and understand the response without scrolling through the entire chat.

Image: https://github.com/user-attachments/assets/64c6731f-d032-4e20-b354-83ce3ace22b3
Author
Owner

@kalle07 commented on GitHub (May 1, 2025):

  • Add an option to clear received snippets after every answer
  • Add an option to embed one document at two different chunk sizes, for fast switching, e.g. between short and long snippets
  • Add an option to clear a cached doc without deleting it and loading it again
  • Make all of these options (and everything above) available at the workspace level ... not global

;)

Author
Owner

@ROlwig commented on GitHub (May 9, 2025):

I realize this may not be as easy as I thought.

From one of our data science / devs who has done a lot of RAG work. I hope this helps:

What can be easily done

  1. Showing page numbers
  • Page numbers can be added during PDF-to-text conversion and stored as metadata
  • This lets us display the document title and page alongside each retrieved snippet
  2. Scrolling to the right snippet and clickable citations for panel view
  • We can tag each snippet with a unique ID or anchor (e.g., Doc1_Page3). When a citation is clicked, the viewer panel can scroll to and highlight that section
  3. Performance of the viewer panel and UX
  • Only minimal citation info (e.g., title, page number, short preview) is shown initially
  • Full snippets load dynamically when a user clicks or expands them
  4. Handling poor-quality or scanned PDFs
  • Structured PDFs with readable text are easy to process for page info and chunking
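
To make the "easy" items concrete, here is a minimal sketch of how a snippet could carry its page metadata, expose a Doc1_Page3-style anchor, and hand the pane only a short preview until the user expands it. The names `Snippet`, `CitationStore`, `doc_id`, and `max_chars` are hypothetical and not taken from any existing codebase; the sketch also assumes at most one snippet per page, so a real version would likely add a chunk index to the anchor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    """One retrieved chunk plus the metadata the pane would display."""
    doc_id: str  # hypothetical document identifier, e.g. "Doc1"
    title: str
    page: int
    text: str

    @property
    def anchor_id(self) -> str:
        # Follows the "Doc1_Page3" pattern; assumes one snippet per page.
        return f"{self.doc_id}_Page{self.page}"

    def preview(self, max_chars: int = 80) -> dict:
        """Minimal citation info shown before the snippet is expanded."""
        if len(self.text) <= max_chars:
            short = self.text
        else:
            short = self.text[:max_chars].rstrip() + "..."
        return {
            "anchor": self.anchor_id,
            "title": self.title,
            "page": self.page,
            "preview": short,
        }


class CitationStore:
    """Keeps snippets by anchor so full text is fetched only on demand."""

    def __init__(self, snippets):
        self._by_anchor = {s.anchor_id: s for s in snippets}

    def previews(self):
        # What the pane would render initially: no full snippet bodies.
        return [s.preview() for s in self._by_anchor.values()]

    def full_text(self, anchor: str) -> str:
        # Called when a citation is clicked or expanded.
        return self._by_anchor[anchor].text
```

A clicked citation would then only need to pass its anchor (e.g. `store.full_text("Doc1_Page3")`) for the pane to scroll to and load that section.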

What can be a challenge

  1. Showing page numbers
  • Many PDFs don’t preserve clean structure; text chunks may not map neatly to a single page
  • Simple chunking (e.g., every 500 tokens) can split content across pages, making page tracking unreliable
  2. Scrolling to the right snippet and clickable citations for panel view
  • A generated response may contain several citations; user might click one and expect the viewer to jump without disrupting others
  • If citations are from different pages, scroll behavior may feel erratic
  3. Performance of the viewer panel and UX
  • Showing long excerpts for every message can slow down the UI and cause lag
  • Payload size becomes a concern as more citations are loaded inline
  4. Handling poor-quality or scanned PDFs
  • Scanned image PDFs or poorly formatted files can’t be parsed properly
  • Without clean/machine-readable text, we can’t extract citations or page numbers
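
The page-tracking problem with fixed-size chunking can be illustrated with a small sketch. It uses whitespace-separated words as a stand-in for tokens (an assumption, since real tokenizers differ), and the function name `chunk_with_pages` is hypothetical. Each chunk records every page it touches, which shows how a single chunk can end up with more than one page number.

```python
def chunk_with_pages(pages, chunk_size):
    """Split page texts into fixed-size word chunks and record touched pages.

    pages: list of page texts, where index 0 is page 1.
    chunk_size: number of words per chunk (stand-in for a token count).
    """
    # Flatten to (word, page_number) pairs so the page origin survives chunking.
    words = [
        (word, page_no)
        for page_no, text in enumerate(pages, start=1)
        for word in text.split()
    ]
    chunks = []
    for start in range(0, len(words), chunk_size):
        window = words[start:start + chunk_size]
        chunks.append({
            "text": " ".join(word for word, _ in window),
            "pages": sorted({page for _, page in window}),
        })
    return chunks
```

With two three-word pages and a chunk size of four, the first chunk spans pages 1 and 2, so a citation built from it cannot point to a single page without extra logic.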

What can be engineered/custom built

  1. Showing page numbers
  • Use PDF parsers that preserve page breaks (e.g., PyMuPDF)
  • Add page markers like <<PAGE: 3>> into text before embedding
  • Ensure chunking aligns with actual document structure, not just token size
  2. Scrolling to the right snippet and clickable citations for panel view
  • Build a scroll manager and state tracker that keeps the right viewer pane in sync with chat
  • Use expandable blocks (accordion or tabs) in the right panel to manage multiple citations clearly
  3. Performance of the viewer panel and UX
  • Use lazy-loading logic in the front end
  • Load full citation content on-demand (when clicked), not upfront
  • Render only a short preview unless expanded
  4. Handling poor-quality or scanned PDFs
  • Add OCR-based fallback (e.g., using Tesseract) for image-based PDFs
  • Validate PDF structure during upload and flag unusable files early
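
The page-marker idea could look roughly like the sketch below. It assumes the per-page texts have already been extracted by a parser that preserves page breaks (PyMuPDF is the one suggested above; no parser is called here), and the helper names `add_page_markers` and `split_on_markers` are hypothetical. Splitting on the markers gives page-aligned pieces, so each chunk maps to exactly one page.

```python
import re

# Matches markers of the form <<PAGE: 3>> and captures the page number.
PAGE_MARKER = re.compile(r"<<PAGE: (\d+)>>")


def add_page_markers(pages):
    """Join page texts into one string, prefixing each page with a marker."""
    return "\n".join(
        f"<<PAGE: {number}>>\n{text}"
        for number, text in enumerate(pages, start=1)
    )


def split_on_markers(marked_text):
    """Recover (page_number, text) pairs so each piece stays within one page."""
    parts = PAGE_MARKER.split(marked_text)
    # With one capture group, re.split yields:
    # [prefix, number_1, text_1, number_2, text_2, ...]
    return [
        (int(parts[i]), parts[i + 1].strip())
        for i in range(1, len(parts), 2)
    ]
```

Long pages would still need to be split further before embedding, but each sub-chunk can inherit the page number of the piece it came from.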
Author
Owner

@kalle07 commented on GitHub (May 9, 2025):

@ROlwig - I have two options for parsing. Are you on Discord?

Author
Owner

@ROlwig commented on GitHub (Jun 7, 2025):

yes...thank you @kalle07.

Author
Owner

@kalle07 commented on GitHub (Jun 7, 2025):

@ROlwig take a look here if you like ;)
https://huggingface.co/kalle07/pdf2txt_parser_converter


Reference
starred/anything-llm#2426