Author: Mirdul Swarup
Last Updated: Mon, Oct 9, 2023
Vultr GPU Stack is designed to streamline the process of building Artificial Intelligence (AI) and Machine Learning (ML) projects by providing a comprehensive suite of pre-installed software, including NVIDIA CUDA Toolkit, NVIDIA cuDNN, TensorFlow, PyTorch, and more.
It reduces the time required to set up the server before you can use it to build, fine-tune, or run inference on a model. We test the pre-installed software on our infrastructure so that it is reliable for your AI/ML development needs.
NVIDIA GPU Drivers: They enable your server to use the NVIDIA GPUs and make them function properly
NVIDIA CUDA Toolkit: It is a set of programming tools and libraries that unlock the potential of NVIDIA GPUs, allowing users to speed up computation and parallel processing tasks
NVIDIA cuDNN: It is a GPU-accelerated library designed for deep neural networks, optimizing the performance of deep learning frameworks like TensorFlow and PyTorch
TensorFlow: It is a machine learning framework used for building and training deep learning models, including neural networks
PyTorch: It is a machine learning framework built around dynamic computation graphs, widely used for research and prototyping in deep learning
JupyterLab: It is a web-based interactive development environment for creating and running Jupyter notebooks, commonly used in data science and machine learning experimentation
Docker: It is a platform for developing, shipping, and running applications inside containers
Choose the Compute menu item on the products page
Click Deploy Server
Select a server type
Select a GPU type according to the specific use case
Select a server location
Select Vultr GPU Stack as the operating system
Select a server size according to the specific use case
Choose additional features as required
Deploy the server
Retrieve the server details
Copy the server IP address and password, then log in with SSH
Check the status of the JupyterLab service
# systemctl status jupyterlab-lab
The above command reports the current state of the jupyterlab-lab service, which helps determine whether the service is active or has issues
Output
● jupyterlab-lab.service - JupyterLab Lab
Loaded: loaded (/lib/systemd/system/jupyterlab-lab.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2023-09-28 11:53:55 UTC; 48min ago
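If you want to check the service state from a script instead of reading the output by eye, a minimal sketch is shown below. It assumes the status text has already been captured as a string, and the helper name parse_active_state is hypothetical, not part of any Vultr or systemd tooling.

```python
import re


def parse_active_state(status_text):
    """Return (state, sub_state) from the 'Active:' line of
    `systemctl status` output, or None if no such line exists."""
    match = re.search(
        r"^\s*Active:\s+(\S+)\s+\(([^)]+)\)", status_text, re.MULTILINE
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


# Sample lines copied from the output shown above
sample_status = """\
     Loaded: loaded (/lib/systemd/system/jupyterlab-lab.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2023-09-28 11:53:55 UTC; 48min ago
"""

print(parse_active_state(sample_status))  # ('active', 'running')
```

A state other than active, such as failed or inactive, suggests that the service needs attention before you try to open JupyterLab.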
Optional: Edit the JupyterLab Configuration
# nano /home/jupyter/.jupyter/jupyter_lab_config.py
Optional: Restart the JupyterLab Service
# systemctl restart jupyterlab-lab
Restart the service to apply any changes made to the configuration
Get the access token
# cat /var/log/jupyterlab/lab.log
The above command outputs the token to access the pre-installed JupyterLab
Output
To access the server, open this file in a browser:
file:///home/jupyter/.local/share/jupyter/runtime/jpserver-1989-open.html
Or copy and paste one of these URLs:
http://localhost:9998/lab?token=6eebc299236fc1debe7b0a8e7bb8000169abcd9e8821df22
http://127.0.0.1:9998/lab?token=6eebc299236fc1debe7b0a8e7bb8000169abcd9e8821df22
To access JupyterLab, open the following URL in your browser
https://YOUR_SERVER_IP:8888/lab?token=YOUR_TOKEN
Make sure to replace YOUR_SERVER_IP with the actual IP address and YOUR_TOKEN with the actual token
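The token lookup and URL substitution can also be scripted. The following is a rough sketch that works on log text you have already read from the log file. The helper names extract_token and build_url are hypothetical, the IP address 203.0.113.10 is a documentation placeholder, and taking the last token in the log assumes that newer entries are appended at the end.

```python
import re


def extract_token(log_text):
    """Return the last token found in JupyterLab log text, or None."""
    tokens = re.findall(r"[?&]token=([0-9a-f]+)", log_text)
    return tokens[-1] if tokens else None


def build_url(server_ip, token, port=8888):
    """Build the browser URL in the format shown above."""
    return f"https://{server_ip}:{port}/lab?token={token}"


# Sample lines copied from the log output shown above
sample_log = """\
Or copy and paste one of these URLs:
http://localhost:9998/lab?token=6eebc299236fc1debe7b0a8e7bb8000169abcd9e8821df22
http://127.0.0.1:9998/lab?token=6eebc299236fc1debe7b0a8e7bb8000169abcd9e8821df22
"""

token = extract_token(sample_log)
# 203.0.113.10 is a placeholder; use your server's IP address
print(build_url("203.0.113.10", token))
```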
Upon accessing JupyterLab, you will see a security warning caused by the self-signed SSL certificate. Click Advanced and continue to access JupyterLab
Create a Notebook, and run the following script
# Import each framework inside try so a missing package is reported, not raised
try:
    import torch
    print("PyTorch is installed.")
    print("PyTorch version:", torch.__version__)
    print("is cuda available = ", torch.cuda.is_available())
except ImportError:
    torch = None
    print("PyTorch is not installed.")
try:
    import tensorflow as tf
    print("TensorFlow is installed.")
    print("TensorFlow version:", tf.__version__)
except ImportError:
    print("TensorFlow is not installed.")
if torch is not None:
    # torch.version.cuda is the CUDA version PyTorch was built with (None on CPU-only builds)
    cuda_version = torch.version.cuda
    if cuda_version:
        print(f"CUDA toolkit is installed. Version: {cuda_version}")
    else:
        print("CUDA Toolkit is not installed.")
    # torch.backends.cudnn.version() returns None when cuDNN is unavailable
    cudnn_version = torch.backends.cudnn.version()
    if cudnn_version:
        print(f"cuDNN is installed. Version: {cudnn_version}")
    else:
        print("cuDNN is not available.")
The above script outputs the status of CUDA availability and the pre-installed versions of TensorFlow, PyTorch, cuDNN, and CUDA Toolkit.
Perform matrix operation
matrix_a = torch.tensor([[1, 2], [3, 4]])
matrix_b = torch.tensor([[4, 3], [2, 1]])
result_pytorch = torch.mm(matrix_a, matrix_b)
print("Matrix multiplication result using PyTorch:")
print(result_pytorch)
matrix_a = tf.constant([[1, 2], [3, 4]])
matrix_b = tf.constant([[4, 3], [2, 1]])
result_tensorflow = tf.matmul(matrix_a, matrix_b)
print("Matrix multiplication result using TensorFlow:")
print(result_tensorflow.numpy())
The above script performs matrix multiplication using PyTorch and TensorFlow to confirm that the pre-installed versions work as intended.
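To know what a correct result looks like, you can work out the expected product without either framework. The sketch below uses a small pure-Python helper (matmul is a hypothetical name introduced for illustration) on the same two matrices. Both PyTorch and TensorFlow should print the same values.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


matrix_a = [[1, 2], [3, 4]]
matrix_b = [[4, 3], [2, 1]]

expected = matmul(matrix_a, matrix_b)
print(expected)  # [[8, 5], [20, 13]]
```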
Running applications in GPU-accelerated containers makes them easier to scale while providing portability and resource management.
Pull the official pytorch image
# docker pull pytorch/pytorch
Run the container
# docker run --gpus all -it --rm pytorch/pytorch
In the above command, --gpus all instructs Docker to assign all available GPUs to the container, ensuring that the container can utilize GPU resources. -it starts an interactive terminal session, and --rm instructs Docker to remove the container once it exits
Upon the execution of the command, you will be inside the container to perform all actions
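If you launch containers from a Python script, the same flags can be assembled as an argument list. This is only a sketch: docker_run_command is a hypothetical helper, and it builds and prints the command shown above without running Docker.

```python
import shlex


def docker_run_command(image, gpus="all", interactive=True, remove=True):
    """Build the argument list for `docker run` with the flags described above."""
    args = ["docker", "run"]
    if gpus:
        args += ["--gpus", gpus]  # expose GPUs to the container
    if interactive:
        args.append("-it")  # interactive terminal session
    if remove:
        args.append("--rm")  # delete the container when it exits
    args.append(image)
    return args


command = docker_run_command("pytorch/pytorch")
print(shlex.join(command))  # docker run --gpus all -it --rm pytorch/pytorch
```

The list can be passed to subprocess.run(command) on a host where Docker is installed.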
Access the Python console
# python3
Check the GPU availability
>>> import torch
>>> torch.cuda.is_available()
Output
True
Run the following script on the Python terminal
import torch
import time

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
matrix_size = 10000
matrix_a = torch.randn(matrix_size, matrix_size)
matrix_b = torch.randn(matrix_size, matrix_size)
matrix_a_gpu = matrix_a.to(device)
matrix_b_gpu = matrix_b.to(device)

def sync():
    # CUDA kernels run asynchronously; wait for them so the timing is accurate
    if device.type == "cuda":
        torch.cuda.synchronize()

# Time the matrix multiplication on the CPU
start_time = time.time()
result_cpu = torch.matmul(matrix_a, matrix_b)
cpu_execution_time = time.time() - start_time

# Warm up the GPU before timing
for i in range(5):
    _ = torch.matmul(matrix_a_gpu, matrix_b_gpu)

# Time the matrix multiplication on the GPU
sync()
start_time = time.time()
result_gpu = torch.matmul(matrix_a_gpu, matrix_b_gpu)
sync()
gpu_execution_time = time.time() - start_time

speedup = cpu_execution_time / gpu_execution_time
print("Matrix size:", matrix_size, "x", matrix_size)
print("CPU Execution Time:", cpu_execution_time, "seconds")
print("GPU Execution Time:", gpu_execution_time, "seconds")
print("Speedup:", speedup)
The above script creates two random matrices of the given size, times the matrix multiplication on both the CPU and the GPU, and then calculates the speedup achieved by the GPU
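To put the measured times in context, you can convert them to an approximate throughput. The sketch below assumes the common estimate of 2 * n^3 floating-point operations for multiplying two n x n matrices. The helper names and the 0.25 second timing are hypothetical, for illustration only, and are not measurements from any server.

```python
def matmul_flops(n):
    """Approximate floating-point operations for an n x n by n x n product."""
    return 2 * n ** 3


def throughput_tflops(n, seconds):
    """Convert an elapsed time in seconds to teraFLOP/s."""
    return matmul_flops(n) / seconds / 1e12


matrix_size = 10000
hypothetical_gpu_seconds = 0.25  # illustrative value only, not a measurement

print("Operations:", matmul_flops(matrix_size))
print("Throughput (TFLOP/s):", throughput_tflops(matrix_size, hypothetical_gpu_seconds))
```

Substitute the gpu_execution_time or cpu_execution_time printed by the benchmark to get the figure for your own server.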
You can find the Vultr GPU Stack option in the list of Operating Systems while deploying a new server. You can also select the version of the base Ubuntu image.
Vultr GPU Stack is compatible with Cloud GPU servers and Bare Metal servers that are equipped with GPU(s).
Vultr GPU Stack comes with the following pre-installed software:
NVIDIA GPU Driver
NVIDIA CUDA Toolkit
NVIDIA cuDNN
TensorFlow
PyTorch
JupyterLab
Docker with NVIDIA Container Toolkit
Yes, you can deploy a server with the Vultr GPU Stack image using Terraform or the Vultr API by specifying the correct os_id, which you can find using the List OS API endpoint.
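As a rough sketch of the os_id lookup, the snippet below filters a list of OS entries by name. It assumes the List OS endpoint is https://api.vultr.com/v2/os and returns JSON with a top-level "os" array whose entries carry "id" and "name" fields. Check the API reference for the current response shape, pagination, and authentication requirements. The function names are hypothetical, and the sample entries are placeholders, not real IDs.

```python
import json
import urllib.request

VULTR_OS_URL = "https://api.vultr.com/v2/os"  # assumed List OS endpoint


def fetch_os_list(url=VULTR_OS_URL, api_key=None):
    """Fetch OS entries; assumes a JSON body with a top-level "os" array."""
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("os", [])


def find_os_ids(os_list, needle="GPU Stack"):
    """Return (id, name) pairs whose name contains the search text."""
    return [
        (entry["id"], entry["name"])
        for entry in os_list
        if needle.lower() in entry.get("name", "").lower()
    ]


# Placeholder entries for illustration; these IDs and names are not real
sample_os_list = [
    {"id": 9001, "name": "Example GPU Stack image"},
    {"id": 9002, "name": "Example Linux image"},
]

print(find_os_ids(sample_os_list))  # [(9001, 'Example GPU Stack image')]
```

On a machine with network access, find_os_ids(fetch_os_list()) would run the same filter against the live list.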
You can find the token to log into the JupyterLab interface in the logs located in the /var/log/jupyterlab directory.
You can override the JupyterLab configuration by editing the /home/jupyter/.jupyter/jupyter_lab_config.py file and restarting the jupyterlab-lab service.
Yes, you can upload your existing notebooks through the JupyterLab interface or place them in the default notebooks directory.
By default, Jupyter notebooks are located in the /home/jupyter/notebooks directory.
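To confirm that uploaded notebooks landed in the right place, a short sketch like the following lists every notebook under that directory. The helper name list_notebooks is hypothetical, and skipping .ipynb_checkpoints folders is an assumption about what you want to see.

```python
from pathlib import Path


def list_notebooks(root="/home/jupyter/notebooks"):
    """Return sorted .ipynb paths under root, skipping checkpoint copies."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        path
        for path in root_path.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )


for notebook in list_notebooks():
    print(notebook)
```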
You walked through the steps to deploy a Cloud GPU server with Vultr GPU Stack for AI/ML development and deployment. It comes with essential pre-installed software that is tested on our infrastructure for compatibility and reliability. The packaged software helps reduce the time needed to configure the server before you can start working.