Without a doubt, YouTube is the most popular video-sharing platform in the world. As a software developer, you may encounter a situation where you want to script something to download videos either in audio or video format. To achieve this, you can use pytube. At its core, pytube is a lightweight, dependency-free library written in Python. Not only does it include a command-line utility that allows you to download videos right from the terminal, but it also makes pipelining easy by allowing you to specify callback functions for different download events.
Prerequisites
- Python 3 installed on your local machine
- A text editor (e.g., Visual Studio Code, Atom, etc.)
- Fundamental knowledge of Python
Step 1: Create a Virtual Environment
When working on any Python project that involves the installation and use of third party dependency-free libraries, the first thing to do is create a virtual environment. A virtual environment is a Python environment where the libraries and scripts installed into it are isolated from those installed in other virtual environments. The virtual environment retains the libraries installed in the “system” Python by default. For example, the library installed as part of your operating system can allow you to create a “virtual” isolated Python installation, and you install packages into that virtual installation.
By virtue of creating a virtual environment, when you switch projects, you can simply create a new virtual environment and not have to worry about breaking the packages installed in the other environments. It is one of the most important tools used by Python developers. With a virtual environment, you will have full control over the libraries used in the project.
In this tutorial, you will use the pytube and Flask APIs.
To create a virtual environment, go to your project’s directory and run venv
:
For Unix/macOS
python3 -m venv env
For Windows
python -m venv env
Before you can start installing or using packages in your virtual environment, you’ll need to activate it. Activating a virtual environment will put the virtual environment-specific python
and pip
executables into your shell’s PATH
.
For Unix/macOS
source env/bin/activate
For Windows
.\env\Scripts\activate
Step 2: Install the Needed Libraries
To complete the environment setup, you’ll need to install Flask
and pytube
. You install these by creating a requirements.txt
file. The requirements.txt
file is a text file where you list the libraries required for your application. It is the convention typically used by developers that makes it easier to manage applications where numerous libraries exist as dependencies. Although you will not use the Flask API library until later in the tutorial, follow these steps to install it now:
- Open the folder in Visual Studio Code.
- Create a new file.
- Name the file
requirements.txt
and add the following text:
flask
requests
pytube
- Save the file by clicking Ctrl-S on Windows, or Cmd-S if you are using MacOS.
- Return to the command or terminal window and perform the installation by using pip to run the following command:
pip install -r requirements.txt
This command will download and install the necessary libraries and their dependencies.
Step 3: Collect the Input URL and Extract Audio
The next step is to get the URL of the video from which the audio will be extracted. However, before doing that, you need to import the necessary libraries (i.e., pytube and OS). While the pytube library provides the facilities to download YouTube videos from the web, the OS library provides a portable way of using operating-system-dependent functionality:
from pytube import YouTube
import os
With the libraries imported, you’ll now get the url of the video to be downloaded from the user. To do this, use the input()
function to get the url from the user and the YouTube()
function as imported from the pytube library to save it as a variable for downloading:
yt = YouTube(str(input("Enter the URL of the video you want to download: \n>> ")))
Since the audio of the YouTube video is the focus of this tutorial, you’ll extract the audio from the variable you created in the previous block of code. To do this, you need to use the streams()
and filter()
methods while setting the only_audio
Boolean parameter as True
, signifying that only the audio should be extracted:
audio = yt.streams.filter(only_audio = True).first()
Step 4: Set a Destination for Saved Files
The next step is to determine the destination where the file will be saved. This can be done by creating a variable called destination that will hold the path to your video as a string. In this instance, there will be an option for the user to save the audio file in the same directory as the project:
print("Enter the destination (leave blank for current directory)")
destination = str(input(">> ")) or '.'
Step 5: Download and Save the File
The final step is to download and save the audio file:
https://user-c185237a-4958-4e1e-a972-57cc037976e7-agyqaaz4hq-uc.a.run.app/?p=237941b8f5
You can see from the line of code above that the output_path
parameter is set as the destination variable you created earlier to hold the file path for the audio file. You will then save the audio file as .mp3:
base, ext = os.path.splitext(out_file)
new_file = base + '.mp3'
os.rename(out_file, new_file)
With the video saved, you’ll then display a result of success for the user to know that the audio has been successfully downloaded:
print(yt.title + " has been successfully downloaded in .mp3 format.")
Step 6: Test in the Terminal
Next, you can test the program you’ve built on the terminal. To do this, go to the project directory in your terminal and run the command shown below:
python <file_name>.py
Step 7: Use Flask to Build a Web Application to Prioritize Functionality
Flask is a micro web framework written in Python that is used to develop web applications. In this tutorial, you’ll use Flask to create a web application that will take the format shown below.
Flask uses templates to expand the functionality of a web application while maintaining a simple and organized file structure. Templates are enabled using the Jinja2 template engine, which was installed when you downloaded Flask. Templates also allow data to be shared and processed before being turned into content and sent back to the client.
Now, create a folder called templates
in your project folder. In that folder, you’ll create two files – index.html
and results.html
. This will form the baseline for your web application’s frontend.
At this point, your project folder should look like this:
Add the following block of code to the index.html
file:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@4.5.3/dist/css/bootstrap.min.css"
integrity="sha384-TX8t27EcRE3e/ihU7zmQxVncDAy5uIKz4rEkgIXeMed4M0jlfIDPvg6uqKI2xXr2" crossorigin="anonymous">
<title>YouTube downloader</title>
</head>
<body>
<div class="container">
<h1>YouTube downloader as MP3</h1>
<div>Enter the URL of the video you want to download: </div>
<div>
<form method="POST">
<div class="form-group">
<textarea name="text" cols="20" rows="10" class="form-control"></textarea>
</div>
<button type="submit" class="btn btn-success">Download!</button>
</div>
</form>
</div>
</div>
</body>
</html>
Add this to the results.html
file:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@4.5.3/dist/css/bootstrap.min.css"
integrity="sha384-TX8t27EcRE3e/ihU7zmQxVncDAy5uIKz4rEkgIXeMed4M0jlfIDPvg6uqKI2xXr2" crossorigin="anonymous">
<title>Result</title>
</head>
<body>
<div class="container">
<h2>Results</h2>
<div>
<strong>MP3 file has been successfully downloaded in .mp3 format.</strong>
</div>
<div>
<a href="{{ url_for('index') }}">Try another one!</a>
</div>
</div>
</body>
</html>
Now, edit the app.py
file you created earlier to reflect the flask POST
and GET
methods. The file should look like this:
from flask import Flask, redirect, url_for, request, render_template, session
from pytube import YouTube
import os
import requests
app = Flask(__name__)
@app.route('/', methods=['GET'])
def index():
return render_template('index.html')
@app.route('/', methods=['POST'])
def index_post():
yt = request.form['text']
yt = YouTube(yt)
video = yt.streams.filter(only_audio = True).first()
destination = '.'
out_file = video.download(output_path = destination)
base, ext = os.path.splitext(out_file)
new_file = base + '.mp3'
os.rename(out_file, new_file)
return render_template('results.html')
At this point, with your web application created, you can test it. To do this, open your terminal and run the following command to set the Flask runtime to development. This implies that each time there is a change, the server will automatically reload:
# Windows
set FLASK_ENV=development
# Linux/macOS
export FLASK_ENV=development
Then, run the application!
flask run
After running the command above, open the application in a browser by navigating to http://localhost:5000. You should see the form displayed. Congratulations!
Conclusion
Finally, it’s important to note that this tutorial only touches on one of the features of pytube. Pytube as a Python module complements applications in a plethora of situations. For instance, apart from supporting the download of a YouTube video in audio or video format, pytube also enables the download of a complete YouTube playlist. Furthermore, the library integrates track caption support, and has the ability to capture thumbnail URLs.