Get a list of YouTube videos of a person
The YouTube API is the way to go to get a list of a person's YouTube videos, but it can quickly use up your free credits.

A few days ago I got the question: can you download all the public YouTube videos of a person that were uploaded between 2020 and today? The total number of videos was about two hundred. And no, I did not have access to this person's YouTube account.
In this post, I use the YouTube API to download the required metadata of the videos, one item per video. I looked on PyPI but could not find a suitable package for this trivial problem, so I decided to write some code myself. You can find the code below.
All it does is retrieve data from the YouTube API and store it in an 'items file'. You can then use this file to generate, for example, a file with one yt-dlp download command per video; there is a sketch of this step further below. But that is up to you.
As always I am doing this on Ubuntu 22.04.
yt-dlp and yt-dlp-gui
yt-dlp is a command-line program that can be used to download files from many sources, including YouTube. We can install this and then also install yt-dlp-gui, which gives us a GUI.
This is how I downloaded a number of files: go to YouTube, copy the links, and paste them into yt-dlp-gui. But we do not want to copy-paste 200 video URLs; that is the opposite of DRY (Don't Repeat Yourself)!
YouTube API
To automate the download, we need to retrieve the metadata of all the person's video files. We can do this using the YouTube API. Looking at this API, the way to go appeared to be the "YouTube - Data API - Search" method, see the links below.
Getting a YouTube API key
To use the YouTube API you need a YouTube API key. I am not going to bore you with this here; there are many instructions on the internet on how to get this API key.
The person has no YouTube channel?
To use the YouTube API search method, we need the channelId, which is the id of the person's YouTube channel. But what if the person never created a channel? Even then, the account still has a channelId. One way to find the channelId for an account is to search the internet for:
youtube channel <name>
This will give a link containing the channelId.
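Another option is to ask the YouTube API itself: the search method used below also accepts a type=channel parameter. Here is a minimal sketch of this, not part of the final script; it assumes the channel can be found by searching on the person's name, and it needs the API key from the previous section (note that this call also counts against your credits, more on that later):
# find_channel_id.py
# Sketch: look up candidate channelIds by name using the YouTube API
# search method with type=channel. Replace API_KEY and the query.
import urllib.parse
import requests

api_key = 'API_KEY'
query = 'name of the person'
url = 'https://youtube.googleapis.com/youtube/v3/search?' + urllib.parse.urlencode({
    'part': 'snippet',
    'type': 'channel',
    'q': query,
    'maxResults': 5,
    'key': api_key,
})
r = requests.get(url, timeout=6)
r.raise_for_status()
for item in r.json().get('items', []):
    # for type=channel results, the channelId is in item['id']['channelId']
    print(item['id']['channelId'], item['snippet']['title'])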
The YouTube API search method
I had to get the metadata for some two hundred videos over a period of three years. However, the number of items returned per request by the YouTube API search method (maxResults) is limited to 50, and the default is only 5; the documentation is not very clear about this. Fortunately, the search method allows us to limit a search to a date range with the parameters:
- publishedAfter
- publishedBefore
I decided to split the search into monthly searches, hoping that the person did not upload more videos per month than some limit.
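To give an idea, here is how the boundaries of one month can be computed; this is essentially what get_published_between() in the full code below does:
# Sketch: compute the publishedAfter / publishedBefore values for one month.
import calendar

def published_between(yyyy, m):
    # last day of the month, e.g. 31 for July
    last_day = calendar.monthrange(yyyy, m)[1]
    published_after = f'{yyyy}-{m:02}-01T00:00:00Z'
    published_before = f'{yyyy}-{m:02}-{last_day:02}T23:59:59Z'
    return published_after, published_before

print(published_between(2023, 7))
# ('2023-07-01T00:00:00Z', '2023-07-31T23:59:59Z')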
The YouTube API does not return all items at once but uses paging. The response contains a parameter 'nextPageToken' if there are more items. In that case, we add this token to our next request, receive the response, and so on, until this parameter is no longer present in the response (there is a sketch of this loop after the examples below).
Here is an example of a response:
{
"kind": "youtube#searchListResponse",
"etag": "Hlc-6V55ICoxEujG5nA274peA0o",
"nextPageToken": "CAUQAA",
"regionCode": "NL",
"pageInfo": {
"totalResults": 29,
"resultsPerPage": 5
},
"items": [
...
]
}
And the items look like this:
[
{
'kind': 'youtube#searchResult',
'etag': <etag>,
'id': {
'kind': 'youtube#video',
'videoId': <video id>
},
'snippet': {
'publishedAt': '2023-07-12T01:55:21Z',
'channelId': <channel id>,
'title': <title>,
'description': <description>,
'thumbnails': {
...
},
'channelTitle': <channel title>,
...
}
},
...
]
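The paging loop then looks like this in outline. This is a simplified sketch of what fetch_year_month_videos() in the full code below does; the date range is an example, CHANNEL_ID and API_KEY are placeholders, and save_items() stands in for writing the 'items file':
# Sketch of the paging loop: keep requesting pages until there is no nextPageToken.
import urllib.parse
import requests

url_base = 'https://youtube.googleapis.com/youtube/v3/search?'
url_params = {
    'part': 'snippet,id',
    'channelId': 'CHANNEL_ID',
    'publishedAfter': '2023-07-01T00:00:00Z',
    'publishedBefore': '2023-07-31T23:59:59Z',
    'order': 'date',
    'key': 'API_KEY',
}

def save_items(items):
    # placeholder: the full code below adds new items to the 'items file'
    print(f'received {len(items)} items')

while True:
    r = requests.get(url_base + urllib.parse.urlencode(url_params), timeout=6)
    r.raise_for_status()
    data = r.json()
    save_items(data.get('items', []))
    next_page_token = data.get('nextPageToken')
    if next_page_token is None:
        # no more pages for this search
        break
    # request the next page
    url_params['pageToken'] = next_page_token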
Filename complications
This has nothing to do with the data retrieved from the YouTube API. I came across this when using this data to download video files with yt-dlp and wanted to share this with you.
A typical yt-dlp command to download a YouTube video into an mp4 is:
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" https://www.youtube.com/watch?v=lRlbcxXnMpQ -o "%(title)s.%(ext)s"
Here we let yt-dlp create the filename based on the title of the video. But depending on the operating system you use, not all characters are allowed in a filename.
To keep the filename as close to the video title as possible, yt-dlp replaces certain characters with similar-looking unicode characters. The result looks almost like the title of the video, but often it is NOT the same! This is a nice feature, but it is totally unusable if you want to:
- compare filenames, or
- transfer files between different systems
In the end, I chose to create the filenames myself by replacing unwanted characters with an underscore. In addition, I created a text file with lines containing:
<filename> <video title>
In this way, it is possible to reconstruct the video title starting from a filename. Note that it is even better to include a unique value in the filename, such as the published date and time, to avoid name clashes.
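Below is a sketch of this post-processing step. It reads the 'items file' created by the script at the end of this post, builds a safe filename from the published date and time plus the title, writes the '<filename> <video title>' mapping file, and writes a file with one yt-dlp command per video. The filenames items.json, titles.txt and commands.txt are just examples:
# make_commands.py
# Sketch: turn the 'items file' into yt-dlp commands with safe, unique filenames.
import json
import re

with open('./items.json', 'r') as fo:
    items = json.loads(fo.read())

title_lines = []
command_lines = []
for item in items:
    video_id = item['id']['videoId']
    published_at = item['snippet']['publishedAt']  # e.g. '2023-07-12T01:55:21Z'
    title = item['snippet']['title']
    # replace everything except letters, digits, '.' and '-' with an underscore,
    # and prefix the published date and time to avoid name clashes
    safe_title = re.sub(r'[^A-Za-z0-9.-]+', '_', title).strip('_')
    safe_published = re.sub(r'[^0-9]+', '', published_at)
    filename = f'{safe_published}_{safe_title}'
    title_lines.append(f'{filename}.mp4 {title}')
    command_lines.append(
        'yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" '
        f'"https://www.youtube.com/watch?v={video_id}" -o "{filename}.%(ext)s"'
    )

with open('./titles.txt', 'w') as fo:
    fo.write('\n'.join(title_lines) + '\n')
with open('./commands.txt', 'w') as fo:
    fo.write('\n'.join(command_lines) + '\n')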
YouTube API credits: finished ... :-(
When you start using these APIs, you get free credits from Google so you can get started. Many people on the internet had already warned that the YouTube API search method consumes a lot of credits.
I did some short experiments and three full runs. During the third full run, my credits ran out. That is fast! Or, put differently: that is a lot of money for just a few simple runs. At the time of writing, a search request costs 100 quota units and the free daily quota is 10,000 units, so you get roughly 100 search requests per day.
Anyway, during the second run I had already collected all the data I wanted, so it was enough for this project. And don't panic, the free credits reset every day.
The code
If you want to try it yourself, here is the code; enter the person's channelId and your YouTube API key. We start at the most recent month, retrieve the items, save them, and move to the previous month until we reach the last month. The start and last months are specified as tuples.
The 'items file' is loaded with the JSON items retrieved from the YouTube API. Items are added only for new videoIds, so there is no need to delete this file between runs.
Install these first:
pip install python-dateutil
pip install requests
The code:
# get_video_list.py
import calendar
import datetime
import json
import logging
import os
import sys
import time
import urllib.parse
from dateutil import parser
from dateutil.relativedelta import relativedelta
import requests
def get_logger(
console_log_level=logging.DEBUG,
file_log_level=logging.DEBUG,
log_file=os.path.splitext(__file__)[0] + '.log',
):
logger_format = '%(asctime)s %(levelname)s [%(filename)-30s%(funcName)30s():%(lineno)03s] %(message)s'
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
if console_log_level:
# console
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(console_log_level)
console_handler.setFormatter(logging.Formatter(logger_format))
logger.addHandler(console_handler)
if file_log_level:
# file
file_handler = logging.FileHandler(log_file)
file_handler.setLevel(file_log_level)
file_handler.setFormatter(logging.Formatter(logger_format))
logger.addHandler(file_handler)
return logger
logger = get_logger()
class YouTubeUtils:
def __init__(
self,
logger=None,
channel_id=None,
api_key=None,
yyyy_mm_start=None,
yyyy_mm_last=None,
items_file=None,
):
self.logger = logger
self.channel_id = channel_id
self.api_key = api_key
self.items_file = items_file
# create empty items file if not exists
if not os.path.exists(self.items_file):
items = []
json_data = json.dumps(items)
with open(self.items_file, 'w') as fo:
fo.write(json_data)
self.year_month_dt_start = datetime.datetime(yyyy_mm_start[0], yyyy_mm_start[1], 1, 0, 0, 0)
self.year_month_dt_current = self.year_month_dt_start
self.year_month_dt_last = datetime.datetime(yyyy_mm_last[0], yyyy_mm_last[1], 1, 0, 0, 0)
# api request
self.request_delay = 3
self.request_timeout = 6
def get_previous_year_month(self):
self.logger.debug(f'()')
self.year_month_dt_current -= relativedelta(months=1)
self.logger.debug(f'year_month_dt_current = {self.year_month_dt_current}')
if self.year_month_dt_current < self.year_month_dt_last:
return None, None
yyyy = self.year_month_dt_current.year
m = self.year_month_dt_current.month
return yyyy, m
def get_published_between(self, yyyy, m):
last_day = calendar.monthrange(yyyy, m)[1]
published_after = f'{yyyy}-{m:02}-01T00:00:00Z'
published_before = f'{yyyy}-{m:02}-{last_day:02}T23:59:59Z'
self.logger.debug(f'published_after = {published_after}, published_before = {published_before}')
return published_after, published_before
def get_data_from_youtube_api(self, url):
self.logger.debug(f'(url = {url})')
r = None
try:
r = requests.get(url, timeout=self.request_timeout)
except Exception as e:
self.logger.exception(f'url = {url}')
raise
self.logger.debug(f'status_code = {r.status_code}')
if r.status_code != 200:
raise Exception(f'url = {url}, status_code = {r.status_code}, r = {r.__dict__}')
try:
data = r.json()
self.logger.debug(f'data = {data}')
        except Exception as e:
            raise Exception(f'url = {url}, converting json, status_code = {r.status_code}, r = {r.__dict__}') from e
return data
def add_items_to_items_file(self, items_to_add):
self.logger.debug(f'(items_to_add = {items_to_add})')
# read file + json to dict
with open(self.items_file, 'r') as fo:
json_data = fo.read()
items = json.loads(json_data)
self.logger.debug(f'items = {items}')
# add only unique video_ids
video_ids = []
for item in items:
id = item.get('id')
if id is None:
continue
video_id = id.get('videoId')
if video_id is None:
continue
video_ids.append(video_id)
self.logger.debug(f'video_ids = {video_ids}')
items_added_count = 0
for item_to_add in items_to_add:
self.logger.debug(f'item_to_add = {item_to_add})')
kind_to_add = item_to_add['id']['kind']
if kind_to_add != 'youtube#video':
self.logger.debug(f'skipping kind_to_add = {kind_to_add})')
continue
video_id_to_add = item_to_add['id']['videoId']
if video_id_to_add not in video_ids:
self.logger.debug(f'adding video_id_to_add = {video_id_to_add})')
items.append(item_to_add)
items_added_count += 1
video_ids.append(video_id_to_add)
self.logger.debug(f'items_added_count = {items_added_count})')
if items_added_count > 0:
# dict to json + write file
json_data = json.dumps(items)
with open(self.items_file, 'w') as fo:
fo.write(json_data)
return items_added_count
def fetch_year_month_videos(self, yyyy, m):
self.logger.debug(f'(yyyy = {yyyy}, m = {m})')
published_after, published_before = self.get_published_between(yyyy, m)
url_base = 'https://youtube.googleapis.com/youtube/v3/search?'
url_params = {
'part': 'snippet,id',
'channelId': self.channel_id,
'publishedAfter': published_after,
'publishedBefore': published_before,
            'order': 'date',
'key': self.api_key,
}
url = url_base + urllib.parse.urlencode(url_params)
total_items_added_count = 0
while True:
time.sleep(self.request_delay)
data = self.get_data_from_youtube_api(url)
page_info = data.get('pageInfo')
self.logger.debug(f'page_info = {page_info})')
items = data.get('items')
if items is None:
break
if not isinstance(items, list) or len(items) == 0:
break
# add items
total_items_added_count += self.add_items_to_items_file(items)
next_page_token = data.get('nextPageToken')
self.logger.debug(f'next_page_token = {next_page_token})')
if next_page_token is None:
break
# add next page token
url_params['pageToken'] = next_page_token
url = url_base + urllib.parse.urlencode(url_params)
self.logger.debug(f'total_items_added_count = {total_items_added_count})')
return total_items_added_count
def main():
# replace CHANNEL_ID and API_KEY with your values
yt_utils = YouTubeUtils(
logger=logger,
channel_id='CHANNEL_ID',
api_key='API_KEY',
# current month + 1
yyyy_mm_start=(2023, 10),
yyyy_mm_last=(2020, 1),
items_file='./items.json',
)
while True:
yyyy, m = yt_utils.get_previous_year_month()
if yyyy is None or m is None:
break
logger.debug(f'fetching for {yyyy}-{m:02}')
yt_utils.fetch_year_month_videos(yyyy, m)
if __name__ == '__main__':
main()
Summary
Retrieving data about the YouTube videos using the YouTube API is not very difficult. We created a separate 'items file' to use for further processing.
This was a fun project with a big surprise once I started downloading YouTube videos using the information in the 'items file'. yt-dlp can generate file names that are very close to the title of the video. It does this by inserting unicode characters and it looks great, but I found this very confusing.
Oh, and another surprise. Using the YouTube API can be very expensive. It didn't take much to use up the daily free credits.
Links / credits
YouTube - Data API - Search
https://developers.google.com/youtube/v3/docs/search
yt-dlp
https://github.com/yt-dlp/yt-dlp
yt-dlp-gui and others
https://www.reddit.com/r/youtubedl/wiki/info-guis