mirror of
https://github.com/Mintplex-Labs/anything-llm.git
synced 2026-03-02 22:57:05 -05:00
Workspaces include "non-embedded" files #56
Originally created by @danglingptr0x0 on GitHub (Jun 19, 2023).
I have noticed that one of my workspaces cites files that are not embedded in it. I double-checked its embedded files, restarted both the server and the UI, and retried, but it happened anyway. The files in question have been vectorized, but they are not in the workspace, so maybe it somehow reads all vectorized, but not necessarily embedded files?
@timothycarambat commented on GitHub (Jun 19, 2023):
The embeddings are separated by namespace or collections per each workspace. So if using Pinecone for example it will only do a search across documents/embeddings in that specific namespace.
Are you saying you have embedding documents that are appearing in other workspaces?
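To illustrate the per-workspace separation described above, here is a minimal sketch of a namespace-scoped similarity search. It is a toy in-memory store, not AnythingLLM's or Pinecone's actual code: `ToyVectorStore`, its methods, the workspace names, and the file names are all hypothetical and exist only for this example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: one dict of vectors per namespace."""

    def __init__(self):
        self._namespaces = {}

    def upsert(self, namespace, doc_id, vector):
        # Each workspace writes only into its own namespace.
        self._namespaces.setdefault(namespace, {})[doc_id] = vector

    def search(self, namespace, query, top_k=3):
        # Only vectors stored under `namespace` are candidates.
        candidates = self._namespaces.get(namespace, {})
        ranked = sorted(
            candidates.items(),
            key=lambda item: cosine(query, item[1]),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:top_k]]


store = ToyVectorStore()
store.upsert("workspace-a", "a.txt", [1.0, 0.0])
store.upsert("workspace-b", "b.txt", [1.0, 0.1])

# A query scoped to workspace-a never sees b.txt, however similar it is.
results = store.search("workspace-a", [1.0, 0.0])
```

If a store behaves like this, a document cited from another workspace would suggest that its vectors were written into the wrong namespace, or that the query was not scoped to a namespace at all.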
@danglingptr0x0 commented on GitHub (Jun 20, 2023):
Precisely, yes.
@timothycarambat commented on GitHub (Jun 26, 2023):
When this problem occurs is it referencing source documents in the reply? It may be returning general information as if you just asked it a question in ChatGPT.
It must return the source documents it got information from, so that would determine whether it is using information from other workspaces.
@danglingptr0x0 commented on GitHub (Jun 27, 2023):
@timothycarambat It's referencing docs that are not embedded in the given workspace.
@timothycarambat commented on GitHub (Jun 28, 2023):
Then even doubly so, which documents is it referring to that you are sure are definitely not embedded? It may be possible that the document is in fact embedded in the vector DB but somehow there is no associated document record in the DB, which would be weird but not impossible.
Collections are namespaced and should not be colliding; they are separated logically that way at the vector DB level.
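One way to check for the mismatch described above (vectors present in a workspace's namespace with no matching document record) is to compare the two sets of document identifiers. This is only a sketch under stated assumptions: `find_orphaned_vectors` is a hypothetical helper, and the file names are made up. How the two ID lists would actually be obtained from the vector DB and the application DB is not covered here.

```python
def find_orphaned_vectors(vector_doc_ids, workspace_doc_ids):
    """Return doc IDs that have vectors in the namespace but no
    document record in the workspace (hypothetical helper)."""
    return sorted(set(vector_doc_ids) - set(workspace_doc_ids))


# Hypothetical data: IDs found in the workspace's vector namespace,
# and IDs the document selector shows as embedded in that workspace.
vector_doc_ids = ["report.pdf", "notes.md", "old-spec.txt"]
workspace_doc_ids = ["report.pdf", "notes.md"]

orphans = find_orphaned_vectors(vector_doc_ids, workspace_doc_ids)
```

A non-empty result would match the symptom reported in this thread: documents cited in chat that the workspace does not list as embedded.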
@danglingptr0x0 commented on GitHub (Jul 5, 2023):
@timothycarambat Hm, they are vectorized, but not embedded in the given workspace, according to the doc selector modal. Yet, they are being referenced in the chat. So what you said would, I guess, be the opposite of what I am seeing. Docs are being referenced but not even selected for the given workspace in the first place. I can screencap what I'm seeing if it persists after updating. Haven't interacted with it for a week or two.
@timothycarambat commented on GitHub (Nov 14, 2023):
Marking this issue as stale. We have revised permissions and other document-related features, so please try to replicate again.
@danglingptr0x0 commented on GitHub (Nov 14, 2023):
Will do.