starred/anything-llm


mirror of https://github.com/Mintplex-Labs/anything-llm.git synced 2026-03-02 22:57:05 -05:00

[FEAT]: Add Right Pane for Cited Material Display in Chat UI (RAG Support) #2426


Open

opened 2026-02-28 06:07:29 -05:00 by deekerman · 5 comments

deekerman commented

2026-02-28 06:07:29 -05:00

Owner

Originally created by @ROlwig on GitHub (Apr 29, 2025).

What would you like to see?

Is your feature request related to a problem? Please describe.
Somewhat. In RAG use cases, cited documents are currently embedded inline in messages, which disrupts reading and makes it hard to reference sources, especially for longer documents such as PDF textbooks.

Describe the solution you'd like
Add a right-side pane to the UI that shows retrieved or cited material relevant to the active chat message. Ideal functionality includes:
• Collapsible right pane that previews cited excerpts or document snippets.
• Displays citation metadata (title, page, etc.).
• Clicking a citation in chat scrolls to or highlights that content in the pane, including the specific page from the metadata.
• Option to pin or toggle the pane as needed.

Describe alternatives you've considered
• The current inline citations are fine for short text but hard to scan or review in longer responses.
• Pop-up modals or external links break the conversational flow.

Additional context
Very few open-source UIs offer this feature. Adding it would improve usability for education, research, and compliance settings—where seeing the source matters.
Example use case: In an AI tutor scenario, a student asks a question. The AI replies with an answer based on an uploaded textbook. The right pane shows the relevant excerpt, helping the student verify and understand the response without scrolling through the entire chat.

![Image](https://github.com/user-attachments/assets/64c6731f-d032-4e20-b354-83ce3ace22b3)

deekerman added the enhancement, UI/UX, and feature request labels 2026-02-28 06:07:29 -05:00

deekerman commented

2026-02-28 06:07:33 -05:00

Author

Owner

@kalle07 commented on GitHub (May 1, 2025):

Add an option to clear received snippets after every answer
Add an option to embed one document at two different chunk sizes, for fast switching, e.g. between short and long snippets
Add an option to clear a cached doc without deleting and loading it again
Make all options, including the ones above, available at workspace level ... not global

;)
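A rough sketch of what "workspace level, not global" could look like: each workspace stores only its own overrides, and they are merged over shared defaults. The option names and the default chunk size here are made up for illustration; they are not existing AnythingLLM settings.

```python
# Hypothetical option names and defaults, for illustration only.
GLOBAL_DEFAULTS = {
    "clear_snippets_after_answer": False,
    "chunk_sizes": [500],
}


def resolve_options(workspace_overrides: dict) -> dict:
    """Return the effective options for one workspace.

    Workspace values win over the global defaults; unknown keys are rejected
    so a typo does not silently create a new option.
    """
    unknown = set(workspace_overrides) - set(GLOBAL_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {**GLOBAL_DEFAULTS, **workspace_overrides}


# A workspace that embeds each document at two chunk sizes.
tutor_workspace = resolve_options({"chunk_sizes": [200, 1000]})
```

The merge builds a new dict, so one workspace's overrides never leak into another workspace or into the defaults.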


deekerman commented

2026-02-28 06:07:34 -05:00

Author

Owner

@ROlwig commented on GitHub (May 9, 2025):

I realize this may not be as easy as I thought.

This is from one of our data scientists / devs who has done a lot of RAG work. I hope this helps:

What can be easily done

Showing page numbers

Page numbers can be added during PDF-to-text conversion and stored as metadata
This lets us display the document title and page alongside each retrieved snippet
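To make the point above concrete, here is a minimal sketch. It assumes the PDF has already been converted to one text string per page by whatever parser is in use; `SnippetRecord` and `pages_to_records` are hypothetical names, not part of AnythingLLM.

```python
from dataclasses import dataclass


@dataclass
class SnippetRecord:
    """One snippet plus the metadata displayed next to it (hypothetical shape)."""
    title: str
    page: int  # 1-based page number
    text: str


def pages_to_records(title: str, page_texts: list[str]) -> list[SnippetRecord]:
    """Attach a 1-based page number to each non-empty page of extracted text."""
    records = []
    for page_number, text in enumerate(page_texts, start=1):
        cleaned = text.strip()
        if cleaned:  # skip blank pages so they never show up as citations
            records.append(SnippetRecord(title=title, page=page_number, text=cleaned))
    return records


records = pages_to_records("Biology 101", ["Cells are...", "", "Mitosis is..."])
labels = [f"{r.title}, p. {r.page}" for r in records]
```

Each label is what the pane would show beside a snippet, here a title and page pair.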

Scrolling to the right snippet and clickable citations for panel view

We can tag each snippet with a unique ID or anchor (e.g., Doc1_Page3). When a citation is clicked, the viewer panel can scroll to and highlight that section
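The anchor idea could be sketched like this, using the `Doc1_Page3` format from the example above. The helper names are hypothetical, and the sketch assumes documents are numbered per chat.

```python
import re

# Matches anchors of the form Doc<number>_Page<number>, e.g. Doc1_Page3.
ANCHOR_PATTERN = re.compile(r"^Doc(\d+)_Page(\d+)$")


def make_anchor(doc_index: int, page: int) -> str:
    """Build the anchor string that a citation link and a pane entry share."""
    return f"Doc{doc_index}_Page{page}"


def parse_anchor(anchor: str) -> tuple[int, int]:
    """Recover (doc_index, page) from an anchor, rejecting anything malformed."""
    match = ANCHOR_PATTERN.match(anchor)
    if match is None:
        raise ValueError(f"not a citation anchor: {anchor!r}")
    return int(match.group(1)), int(match.group(2))
```

If several snippets can come from the same page, this format alone is not unique, so a chunk index would need to be added to the anchor.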

Performance of the viewer panel and UX

Only minimal citation info (e.g., title, page number, short preview) is shown initially
Full snippets load dynamically when a user clicks or expands them

Handling poor-quality or scanned PDFs

Structured PDFs with readable text are easy to process for page info and chunking

What can be a challenge

Showing page numbers

Many PDFs don’t preserve clean structure; text chunks may not map neatly to a single page
Simple chunking (e.g., every 500 tokens) can split content across pages, making page tracking unreliable
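A small illustration of the problem, as a sketch only: it chunks by whitespace-separated words as a stand-in for tokens (an assumption, real tokenizers differ) and records every page a chunk touches. `chunk_with_pages` is a hypothetical helper.

```python
def chunk_with_pages(page_texts: list[str], chunk_size: int) -> list[tuple[str, list[int]]]:
    """Split a document into fixed-size word chunks and list the pages each chunk touches."""
    # Flatten to (word, page_number) pairs so every word remembers its page.
    tagged_words = [
        (word, page_number)
        for page_number, text in enumerate(page_texts, start=1)
        for word in text.split()
    ]
    chunks = []
    for start in range(0, len(tagged_words), chunk_size):
        window = tagged_words[start:start + chunk_size]
        text = " ".join(word for word, _ in window)
        pages = sorted({page for _, page in window})
        chunks.append((text, pages))
    return chunks


# Two three-word pages, chunked four words at a time.
chunks = chunk_with_pages(["a b c", "d e f"], chunk_size=4)
```

The first chunk here spans pages 1 and 2, so a citation for it cannot honestly show a single page number; it needs a page range or a chunker that respects page boundaries.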

Scrolling to the right snippet and clickable citations for panel view

A generated response may contain several citations; a user might click one and expect the viewer to jump to it without disrupting the others
If citations are from different pages, scroll behavior may feel erratic

Performance of the viewer panel and UX

Showing long excerpts for every message can slow down the UI and cause lag
Payload size becomes a concern as more citations are loaded inline

Handling poor-quality or scanned PDFs

Scanned image PDFs or poorly formatted files can’t be parsed properly
Without clean/machine-readable text, we can’t extract citations or page numbers

What can be engineered/custom built

Showing page numbers

Use PDF parsers that preserve page breaks (e.g., PyMuPDF)
Add page markers like <<PAGE: 3>> into text before embedding
Ensure chunking aligns with actual document structure, not just token size
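The page-marker approach might look roughly like the sketch below. It assumes per-page text is already available, for example from a parser such as PyMuPDF; the parser call itself is left out, and the helper names are hypothetical.

```python
import re

# Marker format taken from the example above: <<PAGE: 3>>
PAGE_MARKER = re.compile(r"<<PAGE: (\d+)>>")


def add_page_markers(page_texts: list[str]) -> str:
    """Join pages into one string, prefixing each page with a <<PAGE: n>> marker."""
    return "\n".join(
        f"<<PAGE: {page_number}>>\n{text}"
        for page_number, text in enumerate(page_texts, start=1)
    )


def pages_in_chunk(chunk: str) -> list[int]:
    """Return the page numbers whose markers appear inside a chunk."""
    return [int(number) for number in PAGE_MARKER.findall(chunk)]


def strip_markers(chunk: str) -> str:
    """Remove markers before showing the chunk to a reader."""
    return PAGE_MARKER.sub("", chunk).strip()


marked = add_page_markers(["Intro text", "Chapter one"])
```

One caveat: a chunk that starts in the middle of a page contains no marker for that page, so the chunker would have to carry the last seen page number forward.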

Scrolling to the right snippet and clickable citations for panel view

Build a scroll manager and state tracker that keeps the right viewer pane in sync with chat
Use expandable blocks (accordion or tabs) in the right panel to manage multiple citations clearly
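As a sketch of the state tracker, here is the logic in Python for readability. In practice it would live in front-end code, and the class and method names are hypothetical. It tracks which citation is active, which accordion blocks are open, and whether the pane is pinned.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CitationPaneState:
    """Tracks what the right pane should show (hypothetical state model)."""
    pinned: bool = False
    active_anchor: Optional[str] = None
    expanded: set[str] = field(default_factory=set)

    def select(self, anchor: str) -> str:
        """Handle a citation click: expand it and return the anchor to scroll to."""
        self.active_anchor = anchor
        self.expanded.add(anchor)  # other expanded citations stay open
        return anchor

    def collapse(self, anchor: str) -> None:
        """Close one accordion block without touching the others."""
        self.expanded.discard(anchor)
        if self.active_anchor == anchor:
            self.active_anchor = None

    def on_new_message(self) -> None:
        """Reset the pane for a new answer unless the user pinned it."""
        if not self.pinned:
            self.active_anchor = None
            self.expanded.clear()
```

Keeping expanded blocks in a set means clicking a second citation moves the scroll target without closing the first one.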

Performance of the viewer panel and UX

Use lazy-loading logic in the front end
Load full citation content on-demand (when clicked), not upfront
Render only a short preview unless expanded
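A minimal sketch of preview-first, load-on-expand behaviour. `LazyCitation` and `make_preview` are hypothetical names, and the `loader` callable stands in for whatever request would fetch the full snippet from the backend.

```python
from typing import Callable, Optional


def make_preview(text: str, max_chars: int = 80) -> str:
    """Cut text to a short preview, marking truncation with an ellipsis."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "..."


class LazyCitation:
    """Holds a short preview and fetches the full snippet only on first expand."""

    def __init__(self, title: str, page: int, preview: str, loader: Callable[[], str]):
        self.title = title
        self.page = page
        self.preview = preview
        self._loader = loader
        self._full_text: Optional[str] = None  # filled in on first expand

    def expand(self) -> str:
        """Return the full snippet, calling the loader at most once."""
        if self._full_text is None:
            self._full_text = self._loader()
        return self._full_text
```

Only the title, page, and preview travel with the chat message; the full text is requested once, when the user expands the citation, and then cached.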

Handling poor-quality or scanned PDFs

Add OCR-based fallback (e.g., using Tesseract) for image-based PDFs
Validate PDF structure during upload and flag unusable files early
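The upload check could be sketched as below. It only decides which pages would need the OCR fallback; the OCR call itself is left out. The 20-character threshold and the status labels are arbitrary assumptions for illustration.

```python
def pages_needing_ocr(page_texts: list[str], min_chars: int = 20) -> list[int]:
    """Return 1-based page numbers whose extracted text is too short to trust."""
    return [
        page_number
        for page_number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


def validate_upload(page_texts: list[str]) -> str:
    """Classify an upload as 'ok', 'needs_ocr', or 'unusable' (hypothetical labels)."""
    if not page_texts:
        return "unusable"  # nothing could be read from the file at all
    if pages_needing_ocr(page_texts):
        return "needs_ocr"  # at least one page looks like a scanned image
    return "ok"


# Page 1 has readable text; pages 2 and 3 came back empty from the parser.
flagged = pages_needing_ocr(["A full page of readable text here.", "", "  \n"])
```

Running this at upload time lets the UI warn about a scanned file before any citations depend on it.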


deekerman commented

2026-02-28 06:07:34 -05:00

Author

Owner

@kalle07 commented on GitHub (May 9, 2025):

@ROlwig - I have 2 options for parsing. Are you on Discord?


deekerman commented

2026-02-28 06:07:34 -05:00

Author

Owner

@ROlwig commented on GitHub (Jun 7, 2025):

yes...thank you @kalle07.


deekerman commented

2026-02-28 06:07:34 -05:00

Author

Owner

@kalle07 commented on GitHub (Jun 7, 2025):

@ROlwig go here to check it out if you like ;)
https://huggingface.co/kalle07/pdf2txt_parser_converter

