Author: Ashish Kumar PandaLast Updated: Mon, Aug 21, 2023
Generative Facial Prior-Generative Adversarial Network (GFPGAN) is a blind face restoration model used to restore and improve images. The model works with multiple image types such as old images, low-resolution, noisy, blurry, and AI-generated images. Generative Facial Prior (GFP) handles the face restoration process through novel channel-split spatial feature transform layers. This enables a balance between realness and fidelity on generated images.
This article explains how to perform AI Face Restoration using GFPGAN on a Vultr Cloud GPU server. You are to restore a single image from a base URL, generate, and export improved images to compare the AI face restoration result.
Before you begin:
Deploy a Ubuntu A100 Cloud GPU server with at least 10GB GPU VRAM
Create a non-root user with sudo privileges
Switch to the non-root sudo user
# su example_user
In this section, set up the server environment and install the necessary packages to run the GFPGAN model as described below.
Create a new directory to store generated images
$ mkdir -p /home/user/output_images
$ pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu118
The above command installs PyTorch with pre-built CUDA 11.8 libraries.
torchvision offer a platform to develop and train deep learning models, along with tools and pre-trained models for Computer Vision tasks to speed up the process of building vision-based applications. To download the latest version, visit the PyTorch Documentation.
Install Jupyter Notebook
$ pip3 install notebook
Allow the Jupyter Notebook port
8888 through the firewall
$ sudo ufw allow 8888
By default, UFW is active on Vultr servers. The above command allows incoming connections to the Jupyter Notebook port
Launch Jupyter Notebook
$ jupyter notebook --ip=0.0.0.0
The above command starts Jupyter Notebook and listens for incoming connections on all server interfaces with the
When successful, a random token displays in your command output as below:
[I 2023-08-10 12:57:52.455 ServerApp] Jupyter Server 2.7.0 is running at: [I 2023-08-10 12:57:52.455 ServerApp] http://HOSTNAME:8888/tree?token=73631c92ba278d265aedeb3b199bd4d48e5ef5b2eed0ae06 [I 2023-08-10 12:57:52.455 ServerApp] http://127.0.0.1:8888/tree?token=73631c92ba278d265aedeb3b199bd4d48e5ef5b2eed0ae06 [I 2023-08-10 12:57:52.455 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation). [W 2023-08-10 12:57:52.458 ServerApp] No web browser found: Error('could not locate runnable browser'). [C 2023-08-10 12:57:52.458 ServerApp] To access the server, open this file in a browser: file:///home/user/.local/share/jupyter/runtime/jpserver-67384-open.html Or copy and paste one of these URLs: http://ControlNet-Test2:8888/tree?token=73631c92ba278d265aedeb3b199bd4d48e5ef5b2eed0ae06 http://127.0.0.1:8888/tree?token=73631c92ba278d265aedeb3b199bd4d48e5ef5b2eed0ae06
If you receive an error when starting Jupyter Notebook. End your SSH connection, and start a new session to activate Jupyter Notebook on your server.
Using a web browser of your choice, append the generated token to your Server IP to access Jupyter Notebook
Within the Jupyter interface, click New, select Notebook from the dropdown list and create a new Python3 Kernel notebook.
In this section, use Jupyter Notebook to install necessary GFPGAN model dependencies as described in the steps below.
In a new notebook cell, install the required model libraries
!pip install realesrgan gfpgan basicsr
Click the run button on the main menu bar or press CTRL + ENTER to execute the above command
Import the model libraries
from gfpgan import GFPGANer from basicsr.archs.rrdbnet_arch import RRDBNet from realesrgan import RealESRGANer from IPython.display import display, Image import requests import numpy as np import cv2 import os
Below is what each imported library does:
GFPGANer: is a class within the
gfpgan package that used to perform the actual GFPGAN-based image super-resolution
RRDBNet: (Residual-in-Residual Dense Block Network) is a deep neural network architecture that's used in super-resolution algorithms and it's utilized by the GFPGAN model
RealESRGANer: Handles the image background information
display: allows you to display objects in the Jupyter Notebook output cell instead of a separate file or link
Image: allows you to create an Image object from various sources, such as a path, a URL, or raw image data
requests: Handles HTTP requests when fetching an image
numpy: Handles numerical computations with support for arrays and matrices
cv2: Performs computer vision tasks, such as image and video processing, object detection, and more.
os: creates a link to the operating system and handles operations related to the file system, environment variables, and other OS-specific tasks.
Run the command to import all required model libraries
Initialise and set up the GFPGAN model in your Jupyter Notebook file as described in the steps below.
arch = 'clean' model_name = 'GFPGANv1.4' gfpgan_checkpoint = 'https://github.com/TencentARC/GFPGAN/releases/download/v1.3.4/GFPGANv1.4.pth' realersgan_checkpoint = 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth'
The checkpoints defined in the above command are:
arch: As set to
clean, it's the clean version without custom-compiled CUDA extensions used in StyleGAN2
realersgan_checkpoint: Host a URL pointing to a checkpoint file for the GFPGAN and Real-ESRGAN model. The URL consists of
.pth files used to save and load the latest model weights
To get the latest version and update the above checkpoints, visit the official GFPGAN releases page.
RRDBNet for background enhancements
rrdbnet = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
The following are the defined parameters:
num_in_ch: Specifies the number of input channels for the neural network.
num_in_ch=3 means that the network expects input images with three color channels, and represent the Red, Green, and Blue (RGB) channels
num_out_ch: Specifies the number of neural network output channels. Just like
num_out_ch=3 declaration represents that the network is to produce output images using RGB
num_feat: Sets the number of feature channels or filters used in the model to determine the depth or complexity of the network
num_block: Sets the number of RRDBs in the network. Increasing this value can make the network deeper and more powerful
num_grow_ch: Affects how the network scales and transforms information
scale: Sets the upscaling factor that the network can handle.
scale=2 represents that the network can upscale the input images by a factor of 2
RealESRGANer object for background enhancements
bg_upsampler = RealESRGANer( scale=2, model_path=realersgan_checkpoint, model=rrdbnet, tile=400, tile_pad=10, pre_pad=0, half=True )
The above command parameters handle the following functions:
model_path: Specifies the path to the checkpoint file
.pth that contains the pre-trained weights and parameters of the Real-ESRGAN model you defined earlier
model: Sets an instance of the RRDBNet object that represents the architecture used by the Real-ESRGAN algorithm.
tile: Specifies the size of tiles the input image divides to during the upsampling process. Splitting the image into tiles helps in handling large images that might not fit into memory in a batch.
tile=400 divides the image into tile sizes of 400x400 pixels
tile_pad: Determines the number of pixels added as padding around each tile before the upsampling process begins.
pre_pad: Specifies additional padding added to the input image before the upsampling process starts
half=True: Performs upsampling functions using the half-precision floating-point arithmetic. Using
False performs the upsampling using full-precision floating-point arithmetic
GFPGANer object for face restoration
restorer = GFPGANer( model_path=gfpgan_checkpoint, upscale=2, arch=arch, channel_multiplier=2, bg_upsampler=bg_upsampler )
The above parameters function as below:
model_path: Takes in the latest
gfpgan_checkpoint version you defined earlier
upscale: Specifies the upscaling factor for the super-resolution. It represents how much the resolution of the input image should increase
arch: Takes in the latest
arch version which you defined earlier
channel_multiplier: Controls the number of filters used in the GFPGAN model. Increasing this value makes the model better but also increases the GPU memory usage
Create a function to retrieve the image
def download_image(url): response = requests.get(url) image_array = np.frombuffer(response.content, dtype=np.uint8) image = cv2.imdecode(image_array, cv2.IMREAD_COLOR) return image
The above code declares a function named
download_image() with one argument which hosts an image URL. The declarations inside the
download_image() function include:
response: Uses the
requests.get() function to send an
HTTP GET request to the provided URL to download the image from the URL
image_array: Stores the binary image data in the
response.content attribute. It then uses NumPy's
np.frombuffer() function to convert the binary data into a NumPy array of unsigned 8-bit integers (
dtype=np.uint8). The array represents the image data in raw pixel format
image: Stores the
cv2.imdecode function to decode raw image data stored in
cv2.IMREAD_COLOR, and specifies that the image should be in color mode, meaning it's a 3-channel image (RGB)
Run the function to prepare the model to import an image URL as the input and perform face restoration tasks
To run the GFPGAN model, import an image from a direct URL with a readable file extension such as
.png. When ready, load the image as described in the following steps
Load the image using the
image = download_image('https://example.com/image.jpg')
For a quick addition to the model, upload your image to a public image hosting platform such as
imgur and copy the direct image address to import into the function.
Load the image in the
restorer function you defined earlier
cropped_faces, restored_faces, restored_img = restorer.enhance(image)
The restorer function breaks the image and stores it in three data types as below:
cropped_faces: Consists of cropped faces extracted from the input image
restored_faces: Consists of cropped faces when the restoration is complete
restored_img: Consists of the fully restored image
Run the following function to load the processed image. Replace
.jpg with your actual source image extension
def cv2_imshow(image): _, encoded_image = cv2.imencode('.jpg', image) display(Image(data=encoded_image.tobytes()))
The above code creates the function
cv2_imshow() with a single argument that consists of the image URL. The remaining parameters are:
_, encoded_image = cv2.imencode('.jpg', image): Uses
cv2.imencode from the OpenCV library to encode the input image in a format defined by the user. The function returns two values, a boolean value that shows whether the encoding is successful, and encoded image data in the form of a NumPy array
IPython.display.Image to create an image object from the encoded image data and displays it in the output cell of the Jupyter Notebook using
tobytes() method converts the NumPy array containing the encoded image into a bytes object for the Image class.
To view the final images, run the following commands. Replace
i with your desired array index value such as
2 depending on the number of faces in your image
#display original cropped faces cv2_imshow(cropped_faces[i]) #display restored cropped faces cv2_imshow(restored_faces[i]) #display restored image cv2_imshow(restored_img)
As applied in the above command, faces in the image are available in form of an array. Therefore every face has an index value (0,1,2,3 and so on). Set the number
i depending on the face you want to view and the total number of faces in the image.
To view the final generated image, run the following command
Verify that the generated image contains notable improvements as compared to your original image.
To save the final generated image to the output directory you created earlier, define the directory and filename as below
output_directory = "/home/user/output_images/" output_filename = "image.jpg" output_filepath = os.path.join(output_directory, output_filename) if not os.path.exists(output_directory): os.makedirs(output_directory)
The above code block consists of:
output_directory: The location to store your final image
output_filename: The name to save your exported image as
output_filepath: Consists of The
os.path.join() function from the os module used to combine the
output_filename into a single string representing the complete path of the file to store the restored image
os.path.exists: Checks if the output path exists, else, it creates the path.
Save the image
The above command returns
True when the image saves successfully to the directory, and below is what the function does:
cv2.imwrite(): Consists of a function from the
cv2library used to save the restored image to the specified
output_filepath. You can replace
restored_faces[i], or keep
restored_imgdepending on the target image you intend to save.
To download a copy of the generated images, launch a new terminal session and establish an SFTP connection to the server
$ sftp user@SERVER-IP
When logged in, Switch to the
output_images directory in your user home
sftp> cd output_images
View available files in the directory
Download an image file of your choice. For example, to download the
image.jpg file, run the following command
sftp> get image.jpeg
When the download is complete, close the SFTP connection
Using your local computer's file explorer, find the image in your terminal's working directory
In your Jupyter Notebook session, run the following command to view the GPU memory usage statistics
In Half-bit, your output should look like the one below:
+-----------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=============================================================================| | 0 0 0 50737 C /usr/bin/python3 3252MiB | +-----------------------------------------------------------------------------+
As displayed in the output, with half-bit precision, the model uses about 3 GB of VRAM.
With full-bit precision:
+-----------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=============================================================================| | 0 0 0 50855 C /usr/bin/python3 4492MiB | +-----------------------------------------------------------------------------+
In full-bit precision, the model uses 4.5 GB of VRAM.
In this article, you implemented the GFPGAN model on a Vultr Cloud GPU server. You prepared the server, installed dependency libraries, and used the model through Jupyter Notebook. For more information on how to use GFPGAN, visit the official model repository.
To implement more solutions on your Vultr Cloud GPU server, visit the following resources: