Request update thisav.com #26703

Open
opened 2026-02-21 14:26:45 -05:00 by deekerman · 9 comments

Originally created by @WalterShomer on GitHub (Oct 12, 2023).

The link ( nsfw )
https://thisav.com/ja/juq-380

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2021.12.17
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Version log (not sure why it won't update)

C:\Users\homebase>youtube-dl -U
ERROR: can't find the current version. Please try again later.

C:\Users\homebase>youtube-dl --version
2021.12.17

Verbose log

C:\Users\homebase>youtube-dl https://thisav.com/ja/juq-380 -o "D:\2 - Programs\youtube-dl" --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://thisav.com/ja/juq-380', '-o', 'D:\\2 - Programs\\youtube-dl', '--verbose']
[debug] Encodings: locale cp932, fs mbcs, out cp932, pref cp932
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.19041
[debug] exe versions: ffmpeg 2023-04-26-git-e3143703e9-full_build-www.gyan.dev, ffprobe 2023-04-26-git-e3143703e9-full_build-www.gyan.dev
[debug] Proxy map: {}
[generic] juq-380: Requesting header
WARNING: Could not send HEAD request to https://thisav.com/ja/juq-380: HTTP Error 403: Forbidden
[generic] juq-380: Downloading webpage
ERROR: Unable to download webpage: HTTP Error 403: Forbidden (caused by HTTPError()); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\extractor\common.py", line 634, in _request_webpage
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpupik7c6w\build\youtube_dl\YoutubeDL.py", line 2288, in urlopen
  File "C:\Python\Python34\lib\urllib\request.py", line 470, in open
  File "C:\Python\Python34\lib\urllib\request.py", line 580, in http_response
  File "C:\Python\Python34\lib\urllib\request.py", line 508, in error
  File "C:\Python\Python34\lib\urllib\request.py", line 442, in _call_chain
  File "C:\Python\Python34\lib\urllib\request.py", line 588, in http_error_default

Description

The downloader doesn't work. I've seen that it supports thisjav.com. I can't currently find the file, but I remember seeing that the URL was written as thisjav.com/video/name, whereas the site now points to thisjav.com/name.

Also, I'm running youtube-dl.exe from the PATH.


@dirkf commented on GitHub (Oct 13, 2023):

Various issues are preventing the archival of important Asian babe content.

  1. The expected URL pattern has /videos/... while the site now uses /{lang}/... with an optional 2-3 character language component. However, old-style URLs may still be valid.
  2. Once the extractor sees the page, it can't find any video link using the existing tactics. The link is obfuscated in an eval(function (p,a,c,k,e,d){..}) JS block. extractor/xfileshare.py knows how to decode this, and the decoded URL in a test page could be retrieved when using the page URL as Referer header.
  3. The DASH MPD manifest retrieved seems to be degenerate, in that no segment URLs are provided. Further investigation, ideally by someone who is a DASH expert, unlike me, is needed.
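
As an illustration of point 1, here is a minimal sketch of a URL pattern that accepts both the old `/videos/...` form and the newer form with an optional 2-3 letter language prefix. The names `THISAV_URL_RE` and `parse_thisav_url` are hypothetical and this is not the extractor's actual `_VALID_URL`; the exact shape of old-style IDs is also an assumption.

```python
import re

# Hypothetical pattern for illustration only; not the real extractor's _VALID_URL.
THISAV_URL_RE = re.compile(
    r'''(?x)
        https?://(?:www\.)?thisav\.com/
        (?:
            videos?/                      # old style: /video/ or /videos/
          |
            (?:(?P<lang>[a-z]{2,3})/)?    # new style: optional language prefix
        )
        (?P<id>[\w-]+)
        /?$
    ''')


def parse_thisav_url(url):
    """Return (lang, video_id) for a matching URL, or None if it does not match."""
    m = THISAV_URL_RE.match(url)
    if not m:
        return None
    return m.group('lang'), m.group('id')


print(parse_thisav_url('https://thisav.com/ja/juq-380'))
print(parse_thisav_url('https://thisav.com/videos/12345'))
```

Trying the old-style alternative first keeps a literal `videos` path segment from being read as a video ID.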

@dirkf commented on GitHub (Oct 14, 2023):

Well, I've tweaked the DASH format extraction so that the old test video 2 in the extractor works, though I might well have broken some or all other DASH extraction in doing so. With 1 and 2 above as well:

$ python -m youtube_dl -v -F 'https://thisav.com/ja/juq-380'
[debug] System config: [u'--prefer-ffmpeg']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'-F', u'https://thisav.com/ja/juq-380']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: 66ab0814c
[debug] Python 2.7.18 (CPython i686 32bit) - Linux-4.4.0-210-generic-i686-with-Ubuntu-16.04-xenial - OpenSSL 1.1.1w  11 Sep 2023 - glibc 2.15
[debug] exe versions: avconv 4.3, avprobe 4.3, ffmpeg 4.3, ffprobe 4.3
[debug] Proxy map: {}
[ThisAV] juq-380: Downloading webpage
[ThisAV] juq-380: Extracting from obfuscated HTML5
[ThisAV] juq-380: Downloading m3u8 information
WARNING: unable to extract uploader name; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[info] Available formats for juq-380:
format code  extension  resolution note
800          mp4        640x360     800k 
1400         mp4        842x480    1400k 
2800         mp4        1280x720   2800k  (best)
$ 

I was able to view the video using format 800. This video uses HLS instead of DASH and doesn't need ffmpeg. The new page format (even for old URLs) doesn't seem to include any uploader info.


@WalterShomer commented on GitHub (Oct 14, 2023):

> Various issues are preventing the archival of important Asian babe content.
>
> 1. The expected URL pattern has `/videos/...` while the site now uses `/{lang}/...` with an optional 2-3 character language component. However old-style URLs may still be valid.
>
> 2. Once the extractor sees the page, it can't find any video link using the existing tactics. The link is obfuscated in an `eval(function (p,a,c,k,e,d){..})` JS block. `extractor/xfileshare.py` knows how to decode this, and the decoded URL in a test page could be retrieved when using the page URL as `Referer` header.
>
> 3. The DASH MPD manifest retrieved seems to be degenerate, in that no segment URLs are provided. Further investigation, ideally by someone who is a DASH expert, unlike me, is needed.

Hi, thanks for the quick reply and the attention.

Could you please elaborate on what you changed for point 1? I've tried some small edits in thisav.py but didn't seem to get anywhere.

Also, for point 2, would a referer like this be enough? python -m youtube_dl -v -F --add-header referer 'http://xvideosharing.com/fq65f94nd2ve' 'https://thisav.com/ja/juq-380'
or --referer 'http://xvideosharing.com/fq65f94nd2ve'
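
For reference, point 2 in the quoted comment describes sending the page URL itself as the `Referer` when fetching the decoded link. A minimal standard-library sketch of that idea follows; `build_media_request` is a hypothetical helper and the media URL is a placeholder, since the real decoded link is not shown in this thread.

```python
from urllib.request import Request


def build_media_request(media_url, page_url):
    """Build a request for the decoded media URL, with the page URL as Referer."""
    req = Request(media_url)
    req.add_header('Referer', page_url)
    return req


# Placeholder media URL (assumption); only the page URL comes from this issue.
req = build_media_request(
    'https://media.example.invalid/stream.m3u8',
    'https://thisav.com/ja/juq-380',
)
print(req.get_header('Referer'))
```

On the command line, youtube-dl takes the same header as `--referer URL` or `--add-header 'Referer:URL'`.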


@dirkf commented on GitHub (Oct 14, 2023):

Really the changes are too extensive to publish as a patch. A PR will be needed.


@longsack commented on GitHub (Oct 15, 2023):

I'm following this but I'm not familiar with much, just learning. What is a PR (public release)?


@dirkf commented on GitHub (Oct 15, 2023):

https://docs.github.com/articles/about-pull-requests


@longsack commented on GitHub (Oct 16, 2023):

Thanks @dirkf I will keep my eye on this issue, really interested in this site.


@dirkf commented on GitHub (Jan 24, 2024):

> ... I might well have broken some or all other DASH extraction in doing so. ...

The specification for the resolution of BaseURL is in ISO/IEC 23009-1 section 5.6, which in turn references RFC 3986, section 5: https://datatracker.ietf.org/doc/html/rfc3986#section-5 (the worm has certainly turned there).
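
As a rough illustration of RFC 3986 reference resolution applied to nested `BaseURL` values, here is a minimal sketch using the standard library's `urljoin`. The helper name `resolve_base_urls` and all example URLs are hypothetical, and this is not the code from the modified extractor.

```python
from urllib.parse import urljoin


def resolve_base_urls(mpd_url, base_urls):
    """Resolve a chain of BaseURL values (outermost first) against the MPD URL.

    Each value is resolved against the result of the previous level, so a
    relative BaseURL extends the current base and an absolute one replaces it.
    """
    resolved = mpd_url
    for base in base_urls:
        resolved = urljoin(resolved, base)
    return resolved


# Hypothetical manifest location and BaseURL values.
mpd_url = 'https://cdn.example.invalid/videos/1234/manifest.mpd'
base = resolve_base_urls(mpd_url, ['media/', 'video_720/'])
segment_url = urljoin(base, 'seg-1.m4s')
print(segment_url)
```

Note that a relative `BaseURL` without a trailing slash replaces the last path segment of its parent rather than extending it, which is an easy place for segment URLs to go wrong.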


@dirkf commented on GitHub (Jan 27, 2024):

My modified extractor code still succeeds as above (ie, no uploader is found, but otherwise OK), with PR #32710.
