mirror of
https://github.com/Mintplex-Labs/anything-llm.git
synced 2026-03-02 22:57:05 -05:00
[BUG]: File Parsing Fails for URLs Without Explicit File Extensions #2897
Originally created by @angelplusultra on GitHub (Oct 8, 2025).
Originally assigned to: @timothycarambat on GitHub.
How are you running AnythingLLM?
All versions
What happened?
When attempting to pull and parse a file using either the RAG Modal or @agent mode, the process fails if the URL does not explicitly end with a file extension (e.g., .pdf, .csv). This occurs even when the server responds with a correct Content-Type header that identifies the file type.
Example:
The following URL fails to be processed, despite responding with an application/pdf content type: https://arxiv.org/pdf/2307.10265
Observed Behavior:
The application logs display the following error:
This error originates from the file extension guard located at:
github.com/Mintplex-Labs/anything-llm@89a01492b5/collector/processSingleFile/index.js (L58-L72)
Expected Behavior:
The system should successfully pull and parse files from URLs that do not explicitly contain a file extension, provided the Content-Type header in the server's response clearly indicates the file's MIME type.
Are there known steps to reproduce?
No response
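For illustration, the expected behavior described in this report could be sketched roughly as below. This is not the collector's actual code: `resolveExtension`, `MIME_TO_EXT`, and `KNOWN_EXTS` are hypothetical names, and the MIME map is deliberately tiny. Note that Node's `path.extname` reports `.10265` for the example URL, so the sketch only trusts a path extension it recognizes and otherwise falls back to the Content-Type header.

```javascript
const path = require("path");

// Hypothetical, deliberately small MIME-to-extension map (an assumption for
// this sketch); a real implementation would need to cover many more types.
const MIME_TO_EXT = new Map([
  ["application/pdf", ".pdf"],
  ["text/csv", ".csv"],
  ["text/plain", ".txt"],
  ["application/json", ".json"],
]);
const KNOWN_EXTS = new Set(MIME_TO_EXT.values());

// Hypothetical helper, not part of the AnythingLLM codebase.
// Returns a file extension such as ".pdf", or null if none can be determined.
function resolveExtension(fileUrl, contentTypeHeader) {
  // Prefer the URL path, but only when its extension is one we recognize.
  const fromPath = path.extname(new URL(fileUrl).pathname).toLowerCase();
  if (KNOWN_EXTS.has(fromPath)) return fromPath;

  // Otherwise use the Content-Type header, dropping parameters such as
  // "; charset=utf-8" before the lookup.
  const mime = (contentTypeHeader || "").split(";")[0].trim().toLowerCase();
  return MIME_TO_EXT.get(mime) ?? null;
}

// Usage with the URL from this report:
console.log(resolveExtension("https://arxiv.org/pdf/2307.10265", "application/pdf")); // ".pdf"
```

A caller could run this before the extension guard and reject the file only when the result is null.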
@Guru6163 commented on GitHub (Oct 8, 2025):
I would like to work on this @timothycarambat