CLI option to DUMP [VideosPage/ChannelPage/Playlist] URL-Lists to text file using standalone/inclusive option #25372

Open
opened 2026-02-21 12:29:51 -05:00 by deekerman · 0 comments
Owner

Originally created by @unityconstruct on GitHub (Jul 19, 2022).

Checklist

  • I'm reporting a feature request
  • I've verified that I'm running youtube-dl version 2021.12.17
  • I've searched the bugtracker for similar feature requests including closed ones
    Searched the config page & CLI help for any clues the feature might already exist.

Description

Feature Request:

  • Would like to have a CLI option to DUMP these URLs to a text file, usable either as a standalone option or along with the other options.
  • This is helpful for incremental 'fetches' of video/channel pages where there are 900+ videos.

'Problem'

  • When scraping a videos-page/channel, youtube-dl pulls ALL pages & apparently scrapes them to build the list of URLs to download before iterating through them.
  • For each URL, youtube-dl appears to pull the URL, parse its metadata, then check it against the date-range & the archive file (previously downloaded?).
  • The date-range certainly helps in targeting content, but youtube-dl will still produce an obscene number of GET requests to the host.
  • A URL dump would allow the end-user to fetch the URL list with only a few GETs (one per page of the listing) and then manually adjust the list.
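To put rough numbers on the difference, here is a back-of-the-envelope sketch. It uses the figures from the CLI session further down in this issue (23 listing pages, 938 videos) and assumes two GETs per video, since the log shows a "Downloading webpage" and a "Downloading MPD manifest" step for each one. The function name and the per-video count are illustrative assumptions; the real number of requests may well be higher.

```python
def estimate_requests(videos, pages, per_video_requests=2):
    """Return (listing_only, full_iteration) GET request estimates.

    per_video_requests=2 is an assumption read off the log
    (webpage + MPD manifest); it is not a measured value.
    """
    listing_only = 1 + pages  # channel webpage + one GET per listing page
    full_iteration = listing_only + videos * per_video_requests
    return listing_only, full_iteration


listing_only, full_iteration = estimate_requests(videos=938, pages=23)
print(listing_only, full_iteration)  # prints "24 1900"
```

Under those assumptions a URL-only dump costs about 24 requests, while walking every video just to reject it by date costs roughly 1900.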

Use Case:

  • I have a shell script that reads a text file and iteratively calls youtube-dl to fetch a single video per line.
  • This allows me to aggregate videos when I see something interesting, then run a script to fetch them all at once.
  • Having youtube-dl dump ALL URLs of a videos-page/channel to text/console (& before iterating them) would allow me to both archive the full URL addresses for later use (such as referencing in social media posts or online articles) and tailor the list to reduce traffic to the 'videos host'.
  • Part of me winces when I see 900 GETs to YT, as it just seems a little less than 'low key'. I can't help but think this type of traffic makes youtube-dl a target, where mechanisms start showing up to frustrate this type of scraper engine.
  • Seems this type of traffic would also flag VPN IPs much faster than otherwise might occur.
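The workflow described above can be sketched roughly as follows. This is a Python stand-in for the shell script, not the script itself: the function names, the '#' comment convention and the default binary name are assumptions made for illustration. The only youtube-dl options used are --download-archive and --no-playlist, both taken from the command string in this issue.

```python
import subprocess


def read_url_list(path):
    """Return the URLs in a text file, skipping blank lines and '#' comments."""
    urls = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#"):
                urls.append(line)
    return urls


def build_commands(urls, archive_path, binary="youtube-dl"):
    """Build one youtube-dl invocation per URL (argument lists, no shell)."""
    return [
        [binary, "--download-archive", archive_path, "--no-playlist", url]
        for url in urls
    ]


def fetch_all(list_path, archive_path):
    """Run the commands one after another; a failed video does not stop the run."""
    for command in build_commands(read_url_list(list_path), archive_path):
        subprocess.call(command)
```

With a dumped URL list, the hand-edited text file becomes the only input, so the channel listing never has to be re-scraped just to pick out a few videos.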

Syntax: dump urls AND PROCESS

--dump-urls "PATH"
--dump-urls "/media/media3/ytu/archive-urls.txt"
Scrape URLS, then dump to text file & continue processing

Syntax: dump urls ONLY AND DO NOT PROCESS

--dump-urls-only "PATH"
--dump-urls-only "/media/media3/ytu/archive-urls.txt"
Scrape URLS, then dump to text file & exit
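The intended difference between the two proposed options can be sketched as below. Both options are part of this request rather than existing youtube-dl behaviour, and dump_urls with its only parameter is a hypothetical helper that exists purely to illustrate the semantics.

```python
def dump_urls(urls, path, only=False):
    """Write one URL per line to `path`.

    Returns the URLs that should still be processed:
    all of them for --dump-urls, none for --dump-urls-only.
    (Hypothetical helper; mirrors the proposed options only.)
    """
    urls = list(urls)
    with open(path, "w", encoding="utf-8") as handle:
        for url in urls:
            handle.write(url + "\n")
    return [] if only else urls
```

In both cases the file is written before any per-video request is made; the only difference is whether the normal download loop still runs afterwards.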

CLI Session

NOT using BATCHFILE, using https://www.youtube.com/c/xxxxxxxx/videos
COMMAND STRING: /media/media3/ytu/youtube-dl -v --console-title --verbose -f best[height<=?1080]/bestvideo[height<=?1080]+bestaudio --merge-output-format mp4 --write-thumbnail --write-description --write-info-json --ignore-errors --max-downloads 50 --no-overwrites --restrict-filenames --download-archive /media/media3/ytu/archive-video.txt --no-playlist --output /home/uc/Videos/ytu/%(uploader)s/%(title)s-%(id)s.%(ext)s --dateafter 20200101 https://www.youtube.com/c/xxxxxxxx/videos
Timestamp:
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'--console-title', u'--verbose', u'-f', u'best[height<=?1080]/bestvideo[height<=?1080]+bestaudio', u'--merge-output-format', u'mp4', u'--write-thumbnail', u'--write-description', u'--write-info-json', u'--ignore-errors', u'--max-downloads', u'50', u'--no-overwrites', u'--restrict-filenames', u'--download-archive', u'/media/media3/ytu/archive-video.txt', u'--no-playlist', u'--output', u'/home/uc/Videos/ytu/%(uploader)s/%(title)s-%(id)s.%(ext)s', u'--dateafter', u'20200101', u'https://www.youtube.com/c/xxxxxxxx/videos']
[debug] Encodings: locale UTF-8, fs UTF-8, out None, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 2.7.17 (CPython) - Linux-4.15.0-20-generic-x86_64-with-LinuxMint-19.1-tessa
[debug] exe versions: ffmpeg 3.4.11, ffprobe 3.4.11
[debug] Proxy map: {}
[youtube:tab] DavidVose: Downloading webpage
[download] Downloading playlist: xxxxxxxx - Videos
[youtube:tab] Downloading page 1
[youtube:tab] Downloading page 2
[youtube:tab] Downloading page 3
[youtube:tab] Downloading page 4
[youtube:tab] Downloading page 5
[youtube:tab] Downloading page 6
[youtube:tab] Downloading page 7
[youtube:tab] Downloading page 8
[youtube:tab] Downloading page 9
[youtube:tab] Downloading page 10
[youtube:tab] Downloading page 11
[youtube:tab] Downloading page 12
[youtube:tab] Downloading page 13
[youtube:tab] Downloading page 14
[youtube:tab] Downloading page 15
[youtube:tab] Downloading page 16
[youtube:tab] Downloading page 17
[youtube:tab] Downloading page 18
[youtube:tab] Downloading page 19
[youtube:tab] Downloading page 20
[youtube:tab] Downloading page 21
[youtube:tab] Downloading page 22
[youtube:tab] Downloading page 23

< DUMP URLS HERE, BEFORE ITERATING THEM >
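As a possible stopgap, youtube-dl's existing --flat-playlist option combined with -j/--dump-json may already get close to this: it lists playlist entries without extracting each video, printing one JSON object per line. The sketch below turns such output into a plain URL list. It assumes every JSON line carries an "id" field holding the video ID and that the watch URL can be rebuilt from it; both are assumptions about the output format, and the function name is illustrative.

```python
import json

WATCH_PREFIX = "https://www.youtube.com/watch?v="


def urls_from_flat_json(lines):
    """Convert JSON lines (one playlist entry per line) into watch URLs.

    Assumes each entry has an "id" key; entries without one are skipped.
    """
    urls = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        video_id = entry.get("id")
        if video_id:
            urls.append(WATCH_PREFIX + video_id)
    return urls


# Intended use (not run here): save the output of
#   youtube-dl --flat-playlist -j <channel URL>
# to a file, then pass the opened file to urls_from_flat_json().
```

This does not replace the requested options, since it needs a separate invocation and a post-processing step, but it shows the point in the session where the URL list is already known.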

Example of [wince-stimuli]

[download] Downloading video 818 of 938
[youtube] YlKd4YysVMA: Downloading webpage
[youtube] Downloading just video YlKd4YysVMA because of --no-playlist
[youtube] YlKd4YysVMA: Downloading MPD manifest
[debug] [youtube] Decrypted nsig ugGYu0WVAK7v5c9d => cP9vgB1UX5u_Lg
[debug] [youtube] Decrypted nsig EPn10_rOqmaCOzKP => -iXaKXk1eQoNZg
[download] 2016-04-10 upload date is not in range 2020-01-01 - 9999-12-31
[download] Downloading video 819 of 938
[youtube] 7XiarrY3CXM: Downloading webpage
[youtube] Downloading just video 7XiarrY3CXM because of --no-playlist
[youtube] 7XiarrY3CXM: Downloading MPD manifest
[debug] [youtube] Decrypted nsig uNGKQi_VOQPwTewK => nw2ROBiUJuA2aA
[debug] [youtube] Decrypted nsig h8YlvZffzSK2qL8x => B2Crz0RmxeGukA
[download] 2016-03-24 upload date is not in range 2020-01-01 - 9999-12-31
[download] Downloading video 820 of 938
