French accents not handled correctly in --write-info-json output #277

Closed
opened 2026-02-20 21:07:46 -05:00 by deekerman · 1 comment

Originally created by @Strolls on GitHub (May 21, 2012).

Compare with --get-description:

$ youtube-dl --skip-download --get-description http://www.youtube.com/watch?v=7hrTwMr2xSk
Valse de Jo Privat Sa Préférée. www.myspace.com/clementreboul
$ youtube-dl --skip-download --write-info-json http://www.youtube.com/watch?v=7hrTwMr2xSk
[youtube] Setting language
[youtube] 7hrTwMr2xSk: Downloading video webpage
[youtube] 7hrTwMr2xSk: Downloading video info webpage
[youtube] 7hrTwMr2xSk: Extracting video information
[info] Video description metadata as JSON to: 7hrTwMr2xSk.flv.info.json
$ cut -f 2 -d ',' 7hrTwMr2xSk.flv.info.json 
 "description": "Valse de Jo Privat Sa Pr\u00e9f\u00e9r\u00e9e. www.myspace.com/clementreboul"
$

Note "préférée" vs. "pr\u00e9f\u00e9r\u00e9e"


@phihag commented on GitHub (May 22, 2012):

This output is correct; see RFC 4627, section 2.5, second paragraph (http://tools.ietf.org/html/rfc4627#section-2.5). To parse JSON, use a proper JSON parser. If you just want the description, use --write-description.

While we could encode non-ASCII characters as UTF-8 in our JSON output, I fear doing so would raise other issues.
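
To illustrate the point about using a proper JSON parser, here is a minimal sketch with Python's standard json module. The JSON text is trimmed down to the one field shown in the report, and the variable names are only for illustration; nothing here is taken from youtube-dl's own code. It shows that the \u00e9 escapes decode back to "é", and that the escaped and literal spellings are the same JSON value.

```python
import json

# The "description" field as it appears in the .info.json file from the report
# (a raw string, so the \u00e9 escapes reach the JSON parser untouched).
raw = r'{"description": "Valse de Jo Privat Sa Pr\u00e9f\u00e9r\u00e9e. www.myspace.com/clementreboul"}'

# A JSON parser decodes the escapes back into the accented characters.
info = json.loads(raw)
description = info["description"]

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True);
# ensure_ascii=False writes them as literal characters instead.
escaped = json.dumps(info)
literal = json.dumps(info, ensure_ascii=False)

# Both spellings parse to the same data.
print(json.loads(escaped) == json.loads(literal))
```

Either form is valid JSON, so a consumer that parses the file rather than slicing it with cut sees "Préférée" in both cases.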

Reference
starred/youtube-dl#277