Youtube metadata for videos

Hint

Consult Metadata evaluation for an explanation of the evaluation system included in the column Int. Data.

Table 5.28: Data points from youtube-dl for videos (CATLISM, 255-262)

Descriptions are adapted from youtube-dl source code.

Attribute name

Type

Int. Data

Description

_filename

string

PID

Full name of the video file

abr

number

PID

Average audio bitrate in KBit/s

acodec

string

PID

Name of the audio codec in use

age_limit

number

SID + PID

Age restriction for the video, as an integer (years) (optional)

alt_title

string

SID

A secondary title of the video. (optional)

average_rating

number

SID

Average rating given by users; the scale used depends on the webpage (optional)

categories.*

array

[SID + PID]

An array of strings, each one describing a category that the video falls in, for example [“Sports”, “Berlin”] as assigned by the Youtube system and the content’s creator (optional)

channel

string

SID

Full name of the channel the video is uploaded on. Note that channel fields may or may not repeat uploader fields. This depends on a particular extractor. (optional)

channel_id

string

PID

Id of the channel. (optional)

channel_url

string

PID

Full URL to a channel webpage. (optional)

chapters.*

array

[SID]

An array of dictionaries containing the start and end time of each video chapter, as defined by the content creator

chapters.0.end_time

number

SID

The end time of the chapter in seconds (optional)

chapters.0.start_time

number

SID

The start time of the chapter in seconds (optional)

chapters.0.title

string

SID

The title of the chapter

comment_count

number

SID

Number of comments on the video (optional)

creator

string

SID

The creator of the video. (optional)

description

string

SID

Full video description. (optional)

dislike_count

number

SID

Number of negative ratings of the video

display_id

string

PID

An alternative identifier for the video, not necessarily unique, but available before title. Typically, id is something like “4234987”, title “Dancing naked mole rats”, and display_id “dancing-naked-mole-rats” (optional)

duration

number

SID + PID

Length of the video in seconds. (optional)

end_time

number

SID

Time in seconds at which playback should end, as specified in the URL. (optional)

ext

string

PID

Video filename extension.

extractor

string

CID

Label of the tool used to extract the data

extractor_key

string

CID

Key name of the extractor class used by the tool to extract the data (e.g. “Youtube”)

format

string

PID

Textual description of the format of the content, e.g. ‘160 - 256x144 (144p)’ indicating the internal Youtube number assigned to the format, the size of the content (width and height in pixels) and the resolution of the format

formats.*

array

[PID]

A list of dictionaries for each format available, ordered from worst to best quality.

formats.0.abr

number

PID

Average audio bitrate in KBit/s

formats.0.acodec

string

PID

Name of the audio codec in use

formats.0.asr

number

PID

Audio sampling rate in Hertz

formats.0.container

string

PID

The name of the file container (e.g. mp4)

formats.0.downloader_options.http_chunk_size

number

PID

The size in bytes of each chunk into which the file is split when transmitted from the Youtube server to the local client

formats.0.ext

string

PID

The extension of the format

formats.0.filesize

number

PID

The number of bytes, if known in advance

formats.0.filesize_approx

number

PID

An estimate for the number of bytes

formats.0.format

string

PID

A human-readable description of the format (“mp4 container with h264/opus”). Calculated from the format_id, width, height, and format_note fields if missing.

formats.0.format_id

string

PID

A short description of the format (“mp4_h264_opus” or “19”). Technically optional, but strongly recommended.

formats.0.format_note

string

PID

Additional info about the format (“3D” or “DASH video”)

formats.0.fps

number

PID

Frame rate of the video

formats.0.fragment_base_url

string

PID

Base URL for fragments. Each fragment’s path value (if present) will be relative to this URL.

formats.0.fragments

string

PID

A list of fragments of a fragmented media. Each fragment entry must contain either a url or a path. If a url is present, the client should use it. Otherwise both path and fragment_base_url must be present.

formats.0.height

number

PID

Height of the video in pixels

formats.0.http_headers.*

array

[CID]

An array containing objects with additional HTTP headers (i.e. instructions) that were added to the request made for collecting the data

formats.0.http_headers.Accept

string

CID

Description of the formats requested from the server

formats.0.http_headers.Accept-Charset

string

CID

List of the character encodings requested from the server

formats.0.http_headers.Accept-Encoding

string

CID

List of the compression formats requested from the server

formats.0.http_headers.Accept-Language

string

CID

List of the languages (in two-letter codes, e.g. ‘en’) requested from the server

formats.0.http_headers.User-Agent

string

CID

The User-Agent (see ‘Crawling and scraping the data’) employed

formats.0.language

string

SID

Language code of the content (e.g. “de” or “en-US”), as defined by the creator

formats.0.language_preference

number

SID

The preferred language of the content to be shown to viewers, as set by the creator

formats.0.manifest_url

string

PID

The URL of the manifest file in case of fragmented media (DASH, hls, hds)

formats.0.no_resume

bool

PID

Whether the server does not support resuming the download (true when resuming is unavailable)

formats.0.player_url

string

PID

Link to the player URL, i.e. the web tool used to play the video

formats.0.preference

number

PID

Order number of this format. If this field is present and not None, the formats get sorted by this field, regardless of all other values. -1 for default (order by other properties), -2 or smaller for less than default. < -1000 to hide the format (if there is another one which is strictly better)

formats.0.protocol

string

CID

The protocol used for the actual download, lower-case. “http”, “https”, “rtsp”, “rtmp”, “rtmpe”, “m3u8”, “m3u8_native” or “http_dash_segments”.

formats.0.quality

number

PID

Order number of the video quality of this format, irrespective of the file format. -1 for default (order by other properties), -2 or smaller for less than default.

formats.0.resolution

string

PID

Textual description of width and height in pixels

formats.0.source_preference

number

CID

Order number for the selected video source (quality takes higher priority). -1 for default (order by other properties), -2 or smaller for less than default.

formats.0.stretched_ratio

number

PID

If given and not 1, indicates that the video’s pixels are not square.

formats.0.tbr

number

PID

Average bitrate of audio and video in KBit/s

formats.0.url

string

PID

The full URL of the video file

formats.0.vbr

number

PID

Average video bitrate in KBit/s

formats.0.vcodec

string

PID

The codec used to encode the video

formats.0.width

number

PID

Width of the video in pixels

fps

number

PID

The number of Frames Per Second of the video

fulltitle

string

SID

The full title of the content, as written by the creator

height

number

PID

Height of the video in pixels

id

string

PID

Unique video identifier; this is the code that appears in a Youtube URL

is_live

bool

SID

Whether this video is a live stream that goes on instead of a fixed-length video. (optional)

license

string

SID + PID

Name of the licence the video is licensed under. (optional)

like_count

number

SID

Number of positive ratings of the video (optional)

location

string

SID

Physical location where the video was filmed as set by the creator (optional)

playlist

string

SID

Name of the playlist the video is part of

playlist_id

string

PID

Unique ID of the playlist

playlist_index

number

SID + PID

Order number of this video in the playlist it belongs to

playlist_title

string

SID

Title of the playlist

playlist_uploader

string

SID

Name of the account that uploaded the playlist

playlist_uploader_id

string

PID

Unique ID of the account that uploaded the playlist

release_date

string

SID

The date (YYYYMMDD) when the video was released. (optional)

repost_count

number

SID

Number of reposts of the video (optional)

resolution

string

PID

The video resolution, e.g. 144p

start_time

number

SID

Time in seconds at which playback should start, as specified in the URL. (optional)

stretched_ratio

number

PID

If given and not 1, indicates that the video’s pixels are not square, i.e. that the video should be resized rather than shown with its stored proportions

subtitles.*

array

[SID]

The available subtitles as a dictionary in the format {tag: subformats}. “tag” is usually a language code, and “subformats” is a list sorted from lower to higher preference

subtitles.*.[LL].data

string

SID + PID

The subtitles file contents (optional), where [LL] is a two-letter label identifying the language using ISO 639-1 format

subtitles.*.[LL].ext

string

PID

The extension of the subtitle track format (e.g. SRV3), where [LL] is a two-letter label identifying the language using ISO 639-1 format

subtitles.*.[LL].url

string

PID

A URL pointing to the subtitles file (optional), where [LL] is a two-letter label identifying the language using ISO 639-1 format

tags.*

array

[SID]

A list of strings each one representing a tag assigned to the video, e.g. [“sweden”, “pop music”] by the creator (optional)

thumbnail

string

PID

Full URL to a video thumbnail image.

thumbnails.*

array

[PID]

An array of JSON objects containing details for the preview thumbnails

thumbnails.0.filesize

number

PID

The size of the thumbnail file in KB

thumbnails.0.height

number

PID

Height of the thumbnail in pixels

thumbnails.0.id

string

PID

Thumbnail format internal ID

thumbnails.0.preference

number

PID

Quality of the image using internal descriptions

thumbnails.0.resolution

string

PID

Resolution of the thumbnail in the format “{width}x{height}” (deprecated)

thumbnails.0.url

string

PID

Direct link to the thumbnail image

thumbnails.0.width

number

PID

Width of the thumbnail in pixels

timestamp

number

PID

UNIX timestamp of the moment the video became available. (optional)

title

string

SID

Video title as written by the creator

upload_date

string

SID + PID

Video upload date (YYYYMMDD). If not explicitly set, calculated from timestamp. (optional)

uploader

string

SID

Full name of the video uploader. (optional)

uploader_id

string

PID

Unique ID of the video uploader. (optional)

uploader_url

string

SID

Full URL to a personal webpage of the video uploader. (optional)

vbr

number

PID

Average video bitrate in KBit/s

vcodec

string

PID

Name of the video codec in use

view_count

number

AID

How many users have watched the video on the platform. (optional)

webpage_url

string

SID

The URL of the video webpage; if given to youtube-dl, it should yield the same result again (it will be set by YoutubeDL if it is missing), e.g. <http://www.ted.com/talks/dan_dennett_on_our_consciousness.html>

webpage_url_basename

string

SID

The name of the web page (the last path component) contained in the webpage_url data-point, e.g. dan_dennett_on_our_consciousness.html for <http://www.ted.com/talks/dan_dennett_on_our_consciousness.html>

width

number

PID

Width of the video in pixels
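
The attribute names in the table read like dotted paths into the JSON metadata that youtube-dl produces (for example a file written with its `--write-info-json` option): `formats.0.abr` would be the `abr` field of the first element of `formats`, and `formats.*` the array as a whole. The following is a minimal sketch under that assumption; it is not taken from the book or from youtube-dl. The `flatten` helper and the `sample` record are hypothetical and hand-written for illustration, and real records contain many more fields.

```python
from datetime import datetime, timezone


def flatten(value, prefix=""):
    """Yield (dotted_path, leaf_value) pairs for nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}.{index}" if prefix else str(index))
    else:
        yield prefix, value


# Hypothetical record (assumption): a tiny subset of the attributes in the table.
sample = {
    "id": "abc123XYZ_0",
    "upload_date": "20200131",   # YYYYMMDD string
    "timestamp": 1580428800,     # UNIX timestamp in seconds
    "categories": ["Education"],
    "formats": [
        {"format_id": "160", "height": 144,
         "http_headers": {"Accept-Language": "en"}},
        {"format_id": "18", "height": 360},
    ],
}

# Dotted paths such as "formats.0.height", mirroring the attribute names above.
flat = dict(flatten(sample))

# upload_date is a plain YYYYMMDD string, so it needs explicit parsing.
upload_day = datetime.strptime(sample["upload_date"], "%Y%m%d").date()

# timestamp is a UNIX timestamp; converted here to an aware UTC datetime.
available_at = datetime.fromtimestamp(sample["timestamp"], tz=timezone.utc)

# formats is described as ordered from worst to best, so the last entry is the best.
best_format = sample["formats"][-1]

for path in sorted(flat):
    print(path, "=", flat[path])
```

Flattening in this way makes it straightforward to compare a downloaded record against the table row by row, for instance to check which of the optional data points are actually present for a given video.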