[FEAT]: Scraping of Authenticated Pages #1925

Open
opened 2026-02-28 05:45:57 -05:00 by deekerman · 9 comments

Originally created by @XarHD on GitHub (Jan 17, 2025).

How are you running AnythingLLM?

Docker (local)

What happened?

I tried to use the Bulk Link Scraper tool to scrape my wiki on localhost. I was able to point the scraper at the correct address, but no matter how many child levels I specify or how many pages I set as a scraping limit, the results are the same. For the last experiment I asked it to scrape 3 child levels and stop at 120 pages; the results popup told me it had successfully scraped 81 pages, but when I went to the list of documents to embed in the workspace, there were only four of them, as always: a document that refers to the main address; one that refers to the main page of the wiki (http://localhost/wiki/index.php/Main_Page); and two that refer to sub-pages that appear on the side menu. The main page has more than just two subpages on the side menu, as well as at least three subpages in the main portion of the page.

Is this an issue, or am I doing something wrong with the scraper?

Are there known steps to reproduce?

No response


@timothycarambat commented on GitHub (Jan 17, 2025):

The collector logs should show exactly what links were found - what do they say?


@XarHD commented on GitHub (Jan 17, 2025):

With the caveat that I made a mistake (this is the Desktop version of AnythingLLM, not the Docker version - apologies), I went into the storage/logs folder but couldn't find any collector log material (or any other logs, for that matter) created or modified today, even though I ran the Bulk Link Scraper today. I did find a file labeled LOG in the Session Storage folder, updated today, but its content just reads:

2025/01/17-23:05:01.827 5578 Reusing MANIFEST C:\Users\<user>\AppData\Roaming\anythingllm-desktop\Session Storage/MANIFEST-000001
2025/01/17-23:05:01.828 5578 Recovering log #4
2025/01/17-23:05:01.828 5578 Reusing old log C:\Users\<user>\AppData\Roaming\anythingllm-desktop\Session Storage/000004.log

The file it refers to, 000004.log, was last updated yesterday evening and contains text like this:

9BhœN   @namespace-4efc86ec_7e72_4466_802e_15615c8ed102-http://localhost/]åqc    
next-map-id253@namespace-7395dce8_55ae_48d0_bd00_ef261a1f7c6a-http://localhost/252:ôFDN 
  @namespace-7395dce8_55ae_48d0_bd00_ef261a1f7c6a-http://localhost/ «F,c    
next-map-id254@namespace-5de4a4d2_6fa5_4380_946b_67070e61618b-http://localhost/253¡á¡N   @namespace-5de4a4d2_6fa5_4380_946b_67070e61618b-http://localhost//iæc   
next-map-id255@namespace-71b7ebf2_0040_4f71_b16a_df59baa4b3d8-

going on for several more lines.

Inside AnythingLLM, the Event Logs do not include logs for the link scraper - it reports "workspace created", followed by "workspace_thread_created" and then "workspace_documents_added" (when I tried to add the four documents it seemed to have scraped, to see what they contained), but the link scraper activity (which occurred between the creation of the workspace and the adding of documents) isn't recorded.

Is there anywhere else I should check? Sorry, I have only been using AnythingLLM for a few days.

For context, I am a writer and my goal is to input all the background information on my book and on the world it's set in, so that I can chat with the LLM to pull out any information I need or identify potential inconsistencies. I can upload single documents - but all the material I have fills 7000 pages of a MediaWiki installation, so it's very time-consuming to transfer everything to documents. I thought of creating a read-only account for the SQL database, but my installation of MediaWiki via XAMPP seems to have issues with the control panel of the SQL database, so while I can use the wiki just fine, I cannot at this time create new users to access the SQL database. That is why I need an alternate way to scrape the info out of the wiki and into AnythingLLM.

Thank you!


@timothycarambat commented on GitHub (Jan 17, 2025):

You should see logs by running the app in debug mode
https://docs.anythingllm.com/installation-desktop/debug#anythingllm-debug-mode-on-linux

But you can also check your storage/logs folder for a collector-DATE.log file.
/Users/<usr>/.config/anythingllm-desktop/storage/
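
Once the collector log is located, a small script can help list which links the scraper actually visited. The following is a minimal sketch, not part of AnythingLLM: it assumes the log is one JSON object per line with a `message` field of the form `Scraping N/M: URL`, as in the excerpt posted later in this thread. The function names `scraped_urls` and `looks_like_login_redirect` are hypothetical, and treating any URL containing `Special:UserLogin` as a login page is a heuristic assumption about MediaWiki links.

```python
import json
import re
from typing import Iterable, List

# Matches messages such as "Scraping 4/81: http://localhost/..."
# (format assumed from the log excerpt in this thread).
SCRAPE_RE = re.compile(r"^Scraping (\d+)/(\d+): (\S+)$")


def scraped_urls(lines: Iterable[str]) -> List[str]:
    """Return the URLs from 'Scraping N/M: URL' messages, in log order."""
    urls = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not objects
        match = SCRAPE_RE.match(str(entry.get("message", "")))
        if match:
            urls.append(match.group(3))
    return urls


def looks_like_login_redirect(url: str) -> bool:
    """Heuristic (assumption): MediaWiki login links contain Special:UserLogin."""
    return "Special:UserLogin" in url
```

To use it, open the log file (for example `open(path, encoding="utf-8")`, where `path` points at the collector-DATE.log file) and pass the file object to `scraped_urls`. A large share of `Special:UserLogin` URLs in the result would suggest the scraper is being sent to the login page rather than to the content pages.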


@XarHD commented on GitHub (Jan 18, 2025):

I found the collector log (for some reason it didn't appear in the log folder yesterday when I checked, but it's there today). Here's the report:

{"level":"info","message":"Collector hot directory and tmp storage wiped!","service":"collector"}
{"level":"info","message":"[production] AnythingLLM Standalone Document processor listening on port 8888.","service":"collector"}
{"level":"info","message":"Discovering links...","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Found 81 links to scrape.","service":"collector"}
{"level":"info","message":"Starting bulk scraping...","service":"collector"}
{"level":"info","message":"Scraping 1/81: http://localhost/theworld/index.php/Main_Page","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Main_Page.","service":"collector"}
{"level":"info","message":"Scraping 2/81: http://localhost/theworld/index.php/Main_Page#mw-head","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Main_Page#mw-head.","service":"collector"}
{"level":"info","message":"Scraping 3/81: http://localhost/theworld/index.php/Main_Page#searchInput","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Main_Page#searchInput.","service":"collector"}
{"level":"info","message":"Scraping 4/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Main+Page&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Main+Page&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 5/81: http://localhost/theworld/index.php/Special:Badtitle","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Special:Badtitle.","service":"collector"}
{"level":"info","message":"Scraping 6/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Main+Page","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Main+Page.","service":"collector"}
{"level":"info","message":"Scraping 7/81: http://localhost/theworld/index.php/Special:RecentChanges","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Special:RecentChanges.","service":"collector"}
{"level":"info","message":"Scraping 8/81: http://localhost/theworld/index.php/Special:Random","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Special:Random.","service":"collector"}
{"level":"info","message":"Scraping 9/81: http://localhost/theworld/index.php/Category:Waning","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Waning.","service":"collector"}
{"level":"info","message":"Scraping 10/81: http://localhost/theworld/index.php/Category:Waning_Characters","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Waning_Characters.","service":"collector"}
{"level":"info","message":"Scraping 11/81: http://localhost/theworld/index.php/Category:1_-_The_World","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:1_-_The_World.","service":"collector"}
{"level":"info","message":"Scraping 12/81: http://localhost/theworld/index.php/Category:Kalaratri","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Kalaratri.","service":"collector"}
{"level":"info","message":"Scraping 13/81: http://localhost/theworld/index.php/Category:Tumulus","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Tumulus.","service":"collector"}
{"level":"info","message":"Scraping 14/81: http://localhost/theworld/index.php/Category:Midnight_Faire","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Midnight_Faire.","service":"collector"}
{"level":"info","message":"Scraping 15/81: http://localhost/theworld/index.php/Category:Otherworlds","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Otherworlds.","service":"collector"}
{"level":"info","message":"Scraping 16/81: http://localhost/theworld/index.php/Category:Root","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Root.","service":"collector"}
{"level":"info","message":"Scraping 17/81: http://localhost/theworld/index.php/Category:Frosthold","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Frosthold.","service":"collector"}
{"level":"info","message":"Scraping 18/81: http://localhost/theworld/index.php/Naming_Conventions","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Naming_Conventions.","service":"collector"}
{"level":"info","message":"Scraping 19/81: http://localhost/theworld/index.php/Location_Adjectives","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Location_Adjectives.","service":"collector"}
{"level":"info","message":"Scraping 20/81: http://localhost/theworld/index.php/Category:Writing","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Writing.","service":"collector"}
{"level":"info","message":"Scraping 21/81: http://localhost/theworld/index.php/Category:Inspiration","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Inspiration.","service":"collector"}
{"level":"info","message":"Scraping 22/81: http://localhost/theworld/index.php/Category:Thousand_Worlds","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Thousand_Worlds.","service":"collector"}
{"level":"info","message":"Scraping 23/81: http://localhost/theworld/index.php/Category:Worlds","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:Worlds.","service":"collector"}
{"level":"info","message":"Scraping 24/81: http://localhost/theworld/index.php/Category:World_Templates","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:World_Templates.","service":"collector"}
{"level":"info","message":"Scraping 25/81: http://localhost/theworld/index.php/Category:ZZ_Pathfinder_2e","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:ZZ_Pathfinder_2e.","service":"collector"}
{"level":"info","message":"Scraping 26/81: http://localhost/theworld/index.php/Category:ZZ_Generic_Characters","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Category:ZZ_Generic_Characters.","service":"collector"}
{"level":"info","message":"Scraping 27/81: http://localhost/theworld/index.php/Special:SpecialPages","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Special:SpecialPages.","service":"collector"}
{"level":"info","message":"Scraping 28/81: http://localhost/theworld/index.php/The_World:Privacy_policy","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/The_World:Privacy_policy.","service":"collector"}
{"level":"info","message":"Scraping 29/81: http://localhost/theworld/index.php/The_World:About","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/The_World:About.","service":"collector"}
{"level":"info","message":"Scraping 30/81: http://localhost/theworld/index.php/The_World:General_disclaimer","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/The_World:General_disclaimer.","service":"collector"}
{"level":"info","message":"Scraping 31/81: http://localhost/theworld/index.php/Special:UserLogin","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php/Special:UserLogin.","service":"collector"}
{"level":"info","message":"Scraping 32/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ABadtitle","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ABadtitle.","service":"collector"}
{"level":"info","message":"Scraping 33/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARecentChanges&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARecentChanges&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 34/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARecentChanges","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARecentChanges.","service":"collector"}
{"level":"info","message":"Scraping 35/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 36/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom.","service":"collector"}
{"level":"info","message":"Scraping 37/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 38/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic.","service":"collector"}
{"level":"info","message":"Scraping 39/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 40/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters.","service":"collector"}
{"level":"info","message":"Scraping 41/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 42/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World.","service":"collector"}
{"level":"info","message":"Scraping 43/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 44/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri.","service":"collector"}
{"level":"info","message":"Scraping 45/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 46/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus.","service":"collector"}
{"level":"info","message":"Scraping 47/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 48/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire.","service":"collector"}
{"level":"info","message":"Scraping 49/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 50/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds.","service":"collector"}
{"level":"info","message":"Scraping 51/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 52/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot.","service":"collector"}
{"level":"info","message":"Scraping 53/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 54/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold.","service":"collector"}
{"level":"info","message":"Scraping 55/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 56/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions.","service":"collector"}
{"level":"info","message":"Scraping 57/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 58/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives.","service":"collector"}
{"level":"info","message":"Scraping 59/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 60/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting.","service":"collector"}
{"level":"info","message":"Scraping 61/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 62/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration.","service":"collector"}
{"level":"info","message":"Scraping 63/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 64/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World.","service":"collector"}
{"level":"info","message":"Scraping 65/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 66/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds.","service":"collector"}
{"level":"info","message":"Scraping 67/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 68/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates.","service":"collector"}
{"level":"info","message":"Scraping 69/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 70/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e.","service":"collector"}
{"level":"info","message":"Scraping 71/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 72/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters.","service":"collector"}
{"level":"info","message":"Scraping 73/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 74/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages.","service":"collector"}
{"level":"info","message":"Scraping 75/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 76/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy.","service":"collector"}
{"level":"info","message":"Scraping 77/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 78/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout.","service":"collector"}
{"level":"info","message":"Scraping 79/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 80/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer.","service":"collector"}
{"level":"info","message":"Scraping 81/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3AUserLogin","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3AUserLogin.","service":"collector"}
{"level":"info","message":"Scraped 81 pages.","service":"collector"}
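
Entries 37 through 81 of the log above all point at `Special:UserLogin`, so a quick tally of what the scraper actually visited can help when reading a log like this. Below is a minimal sketch, assuming the log is available as one JSON object per line in the format shown; the function names and the three categories (`login`, `anchor`, `page`) are made up for illustration and are not part of AnythingLLM.

```python
import json
from collections import Counter
from urllib.parse import urlsplit, parse_qs


def scraped_urls(log_lines):
    """Yield the URL from every 'Scraping N/M: <url>' collector log entry."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        message = entry.get("message", "")
        if message.startswith("Scraping ") and ": " in message:
            yield message.split(": ", 1)[1]


def classify(url):
    """Label a URL as a login page, an in-page anchor, or a regular page."""
    parts = urlsplit(url)
    title = parse_qs(parts.query).get("title", [""])[0]
    if title == "Special:UserLogin" or parts.path.endswith("/Special:UserLogin"):
        return "login"
    if parts.fragment:
        return "anchor"
    return "page"


def summarize(log_lines):
    """Count how many scraped URLs fall into each category."""
    return Counter(classify(url) for url in scraped_urls(log_lines))
```

Running `summarize(open("collector.log"))` (the file name is an assumption) gives a count per category, which shows how much of the page limit went to login links rather than wiki content.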

The end result on the frontend is that only four pages appear in the document picker, with the following names:

localhost_theworld_index.php.html
localhost_theworld_index.php_Location_Adjectives.html
localhost_theworld_index.php_Main_Page.html
localhost_theworld_index.php_Naming_Conventions.html
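
One possible reading of these four names is that each file is named after the host and path of the URL, with the query string and `#fragment` dropped. That is only a guess inferred from the names above, not AnythingLLM's confirmed behavior. Under that assumption, every `index.php?title=Special:UserLogin&...` URL would land on the same `localhost_theworld_index.php.html` file. The sketch below uses a hypothetical `hypothetical_filename` helper to show the collision; it does not explain why the `Category:` pages are missing from the picker.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def hypothetical_filename(url):
    """Assumed naming rule (not taken from AnythingLLM's source):
    host + path with '/' replaced by '_', query and fragment ignored."""
    parts = urlsplit(url)
    slug = (parts.netloc + parts.path).replace("/", "_")
    return slug + ".html"


def group_by_filename(urls):
    """Map each assumed output filename to the URLs that would share it."""
    groups = defaultdict(list)
    for url in urls:
        groups[hypothetical_filename(url)].append(url)
    return dict(groups)


urls = [
    "http://localhost/theworld/index.php/Main_Page",
    "http://localhost/theworld/index.php/Main_Page#mw-head",
    "http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Main+Page",
    "http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot",
]
for name, members in group_by_filename(urls).items():
    print(name, len(members))
```

If the rule holds, the four sample URLs collapse into two files, each written twice, which would be consistent with a high "scraped" count and a short document list.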

scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARecentChanges&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 34/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARecentChanges","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARecentChanges.","service":"collector"} {"level":"info","message":"Scraping 35/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 36/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ARandom.","service":"collector"} {"level":"info","message":"Scraping 37/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 38/81: 
http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic.","service":"collector"} {"level":"info","message":"Scraping 39/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 40/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+Waning+Days+of+Magic+Characters.","service":"collector"} {"level":"info","message":"Scraping 41/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 42/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World","service":"collector"} 
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3A1+-+The+World.","service":"collector"} {"level":"info","message":"Scraping 43/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 44/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AKalaratri.","service":"collector"} {"level":"info","message":"Scraping 45/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 46/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ATumulus.","service":"collector"} 
{"level":"info","message":"Scraping 47/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 48/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Faire.","service":"collector"} {"level":"info","message":"Scraping 49/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 50/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AOtherworlds.","service":"collector"} {"level":"info","message":"Scraping 51/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} 
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 52/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3ARoot.","service":"collector"} {"level":"info","message":"Scraping 53/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 54/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AFrosthold.","service":"collector"} {"level":"info","message":"Scraping 55/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 56/81: 
http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Naming+Conventions.","service":"collector"} {"level":"info","message":"Scraping 57/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 58/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Location+Adjectives.","service":"collector"} {"level":"info","message":"Scraping 59/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 60/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped 
http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWriting.","service":"collector"} {"level":"info","message":"Scraping 61/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 62/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AInspiration.","service":"collector"} {"level":"info","message":"Scraping 63/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 64/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AThe+World.","service":"collector"} {"level":"info","message":"Scraping 65/81: 
http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 66/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorlds.","service":"collector"} {"level":"info","message":"Scraping 67/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 68/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates.","service":"collector"} {"level":"info","message":"Scraping 69/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} 
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 70/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Pathfinder+2e.","service":"collector"} {"level":"info","message":"Scraping 71/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 72/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AZZ+Generic+Characters.","service":"collector"} {"level":"info","message":"Scraping 73/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages&returntoquery=.","service":"collector"} 
{"level":"info","message":"Scraping 74/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3ASpecialPages.","service":"collector"} {"level":"info","message":"Scraping 75/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 76/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3APrivacy+policy.","service":"collector"} {"level":"info","message":"Scraping 77/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout&returntoquery=","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} {"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout&returntoquery=.","service":"collector"} {"level":"info","message":"Scraping 78/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout","service":"collector"} {"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"} 
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AAbout.","service":"collector"}
{"level":"info","message":"Scraping 79/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer&returntoquery=","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer&returntoquery=.","service":"collector"}
{"level":"info","message":"Scraping 80/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=The+World%3AGeneral+disclaimer.","service":"collector"}
{"level":"info","message":"Scraping 81/81: http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3AUserLogin","service":"collector"}
{"level":"info","message":"Cleaning up request handler for request ID.","service":"collector"}
{"level":"info","message":"Successfully scraped http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Special%3AUserLogin.","service":"collector"}
{"level":"info","message":"Scraped 81 pages.","service":"collector"}
```

The end result on the frontend is that only four pages appear in the document picker, with the following names:

> localhost_theworld_index.php.html
> localhost_theworld_index.php_Location_Adjectives.html
> localhost_theworld_index.php_Main_Page.html
> localhost_theworld_index.php_Naming_Conventions.html
Author
Owner

@timothycarambat commented on GitHub (Jan 18, 2025):

http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates

Is this an authenticated service? Seems like a login page or something?
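One way to confirm this from the collector log is to count how many of the scraped URLs are MediaWiki login redirects rather than content pages. The following is a minimal sketch, not part of AnythingLLM: the helper names `is_login_redirect` and `split_scraped` are hypothetical, and it assumes a login redirect is any URL whose `title` query parameter or final path segment is `Special:UserLogin`, as seen in the log above.

```python
from urllib.parse import urlsplit, parse_qs

# Page title MediaWiki uses for its login form (taken from the log above).
LOGIN_TITLE = "Special:UserLogin"


def is_login_redirect(url: str) -> bool:
    """Return True if the URL points at the wiki login page (hypothetical helper)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Form 1: index.php?title=Special:UserLogin&returnto=...
    if LOGIN_TITLE in query.get("title", []):
        return True
    # Form 2: index.php/Special:UserLogin
    return parts.path.rstrip("/").endswith("/" + LOGIN_TITLE)


def split_scraped(urls):
    """Split scraped URLs into (content_pages, login_redirects)."""
    content, login = [], []
    for url in urls:
        (login if is_login_redirect(url) else content).append(url)
    return content, login


if __name__ == "__main__":
    sample = [
        "http://localhost/theworld/index.php/Naming_Conventions",
        "http://localhost/theworld/index.php/Special:UserLogin",
        "http://localhost/theworld/index.php?title=Special:UserLogin&returnto=Category%3AWorld+Templates",
    ]
    content, login = split_scraped(sample)
    print(f"{len(content)} content page(s), {len(login)} login redirect(s)")
```

If most of the 81 entries land in the second list, the scraper is being bounced to the login form rather than reading the wiki pages themselves.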

Author
Owner

@XarHD commented on GitHub (Jan 19, 2025):

It's a private MediaWiki instance run via XAMPP. It does have a login page, although my computer has a saved cookie so I'm not required to log in every time. I assumed the scraper would have the same access from the same computer, but perhaps not?

Author
Owner

@timothycarambat commented on GitHub (Jan 20, 2025):

> I assumed the scraper would have the same access from the same computer, but perhaps not?

Correct, we do not borrow or hijack your current browser session (for obvious reasons). However, private web scraping is for sure something we _can_ enable so that all web scraping from the desktop client has authenticated access to your protected pages.

On Docker, however, this is more complex: enabling session sharing would require the user to specify some kind of Chrome session data location, which is much harder to set up.

Our current solution for Docker users to scrape protected pages is the [Chrome Extension](https://chromewebstore.google.com/detail/anythingllm-browser-compa/pncmdlebcopjodenlllcomedphdmeogm?hl=en&pli=1), which can connect to your instance.
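For illustration only, the general idea of authenticated scraping is that the request carries a session cookie the user supplies explicitly, instead of one taken from the browser. This is a rough sketch under stated assumptions and not how AnythingLLM implements or plans to implement the feature: `build_authenticated_request` is a hypothetical helper, and the cookie name `wiki_session` and its value are placeholders (the real cookie name depends on the wiki's configuration).

```python
from urllib.request import Request


def build_authenticated_request(url: str, cookies: dict) -> Request:
    """Build a GET request that sends user-supplied cookies (hypothetical helper)."""
    # Serialize the cookies into a single Cookie header: "name=value; name2=value2".
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(url, headers={"Cookie": cookie_header})


if __name__ == "__main__":
    # Placeholder cookie name and value; copy the real ones from your own
    # logged-in browser session if you want to try this against your wiki.
    req = build_authenticated_request(
        "http://localhost/theworld/index.php/Naming_Conventions",
        {"wiki_session": "REPLACE_ME"},
    )
    print(req.full_url, req.get_header("Cookie"))
    # Passing `req` to urllib.request.urlopen() would perform the actual fetch.
```

The design point is that the credentials are handed over deliberately by the user, which is different from silently reusing whatever session the browser happens to hold.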
Author
Owner

@XarHD commented on GitHub (Jan 20, 2025):

Thank you. Is it something that can be done with the current version of the desktop AnythingLLM? If so, how?

Author
Owner

@timothycarambat commented on GitHub (Jan 20, 2025):

@XarHD No, which is why I renamed the issue to be a feature. It is something I know we can accommodate, but it's not live right now.
