A Graphics Processing Unit (GPU) is specialized hardware initially designed for computer graphics and image processing. Its highly parallel structure makes a GPU more efficient than a general-purpose Central Processing Unit (CPU) for algorithms that process large blocks of data in parallel. Traditionally, you need to install an on-premises server with one or more GPUs to access this power. This is expensive and inflexible, but there is an alternative.
If you'd like to jump right in, see the Vultr Talon Cloud GPU Quickstart, which explains how to deploy a Cloud GPU or GPU Compute Marketplace App.
Working in close collaboration with NVIDIA, we developed the Vultr Talon platform powered by NVIDIA GPUs and NVIDIA AI Enterprise software. Instead of attaching an entire physical GPU to a cloud server, we attach a fraction in the form of a virtual GPU (vGPU) to create a new instance type: the Cloud GPU.
When you deploy a Vultr Cloud GPU instance, there's no need to hassle with driver installation or license issues. You can skip all those steps and run an NVIDIA GPU-powered application in minutes. You can choose a GPU fraction for your workload and budget and then scale that up or down as needed. Cloud GPUs are ideal for a variety of cloud applications like Big Data applications, Virtual Desktop Infrastructure (VDI), Machine Learning (ML), Artificial Intelligence (AI), High-Performance Computing (HPC), video encoding, cloud gaming solutions, general-purpose computing with CUDA, graphics rendering, and more.
Vultr pre-installs everything you need to get started. Our Cloud GPUs come with licensed NVIDIA drivers, the CUDA Toolkit, and CUDA Deep Neural Network (cuDNN) library. If you want a custom operating system that isn't in our library, just install cloud-init, and we'll automatically install all those components for you.
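As a quick sanity check after deployment, you can confirm the pre-installed toolchain is visible. A minimal Python sketch, assuming only that `nvcc` (the CUDA compiler, which ships with the CUDA Toolkit) is on your `PATH`:

```python
import shutil
import subprocess

def cuda_toolkit_version():
    """Return the CUDA Toolkit release reported by nvcc, or None if nvcc is absent."""
    if shutil.which("nvcc") is None:
        return None  # Toolkit not on PATH (or not installed)
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    # nvcc prints a line like: "Cuda compilation tools, release 11.8, V11.8.89"
    for line in out.stdout.splitlines():
        if "release" in line:
            return line.split("release")[-1].split(",")[0].strip()
    return None

print(cuda_toolkit_version())
```

On a machine without the Toolkit this prints `None`, which is itself a useful signal that the driver stack is not yet in place.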
Follow the steps in our Cloud GPU Quickstart, and in a couple of minutes, you'll have a Cloud GPU instance ready to use, with low per-hour billing and no long-term commitments.
Dedicated GPUs are expensive and often underutilized, but you can use Cloud GPUs to match your workloads to the processing power you need, saving you time and expense. In addition, Cloud GPUs come in affordable fractions ranging from 1/20th of a card up to a fully-dedicated NVIDIA A100 GPU. Our Bare Metal servers can even be configured with multiple cards.
Cloud GPUs are ideal for training models on smaller subsets of your data and then ramping up to full performance later. For example, you might prototype a model on a small vGPU fraction, then redeploy to a larger fraction or a fully-dedicated GPU for full-scale training.
Traditionally, GPU applications require an on-premises server or a cloud server with a fully-dedicated GPU running in passthrough mode. Unfortunately, those solutions cost thousands of dollars per month. Vultr offers an alternative: Cloud GPU instances partitioned into virtual GPUs (vGPUs), which let you pick the performance level that matches your workload and budget. vGPUs are powered by NVIDIA AI Enterprise, which presents your server instance with a vGPU that looks just like a physical GPU. Each vGPU has its own dedicated memory slice and a corresponding portion of the physical GPU's compute power. vGPUs run all the same frameworks, libraries, and operating systems as a physical GPU.
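To make the partitioning concrete, here is a small illustrative Python sketch of how a fraction maps to a dedicated memory slice. The 80 GB card size is an assumption for illustration, not a statement of Vultr's actual plan list:

```python
def vgpu_memory_gb(fraction: float, card_memory_gb: int = 80) -> float:
    """Memory slice a vGPU would receive for a given fraction of the physical card.

    Assumes an 80 GB NVIDIA A100 for illustration; actual plan sizes may differ.
    """
    return card_memory_gb * fraction

# A 1/20th fraction of an 80 GB A100 would carry a 4 GB memory slice.
print(vgpu_memory_gb(1 / 20))  # 4.0
print(vgpu_memory_gb(1.0))     # 80.0, a fully-dedicated card
```

The compute share scales the same way: the smaller the fraction, the smaller (and cheaper) the slice of the physical GPU your instance receives.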
Vultr offers two types of vGPU partitioning: vGPU temporal partitioning and Multi-Instance GPU (MIG) spatial partitioning.
Vultr offers both Cloud GPU and Bare Metal options.
Both GPU server types use NVIDIA A100 processors, the flagship data-center GPU for deep learning, data analytics, and HPC. A100-equipped Cloud GPUs have third-generation Tensor Cores and Ampere Architecture CUDA Cores that accelerate over 700 HPC applications and every deep learning framework. However, because the A100 is designed for AI and HPC compute workloads, it does not include RT Cores for ray tracing acceleration and isn't intended for VDI or video encoding applications.
Our Cloud GPUs come in a range of memory and compute configurations.
Our A100-equipped Cloud GPUs have the NVIDIA cuDNN library pre-installed for GPU-accelerated routines such as forward and backward convolution, pooling, normalization, and activation layers. Some popular use cases include:
A sub-category of AI and ML is computer vision, which trains convolutional neural networks (CNNs) to recognize objects in images. Computer vision is useful for process automation, healthcare, self-driving cars, image classification, and more.
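The core operation a CNN applies to an image is the 2-D convolution, which cuDNN executes on the GPU. A dependency-free Python sketch of a valid-mode convolution, purely to illustrate the arithmetic involved:

```python
def conv2d(image, kernel):
    """Naive 'valid' 2-D convolution (cross-correlation, as in most ML frameworks)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            # Multiply-accumulate the kernel over this window of the image
            acc = 0
            for dy in range(kh):
                for dx in range(kw):
                    acc += image[y + dy][x + dx] * kernel[dy][dx]
            row.append(acc)
        out.append(row)
    return out

# A 3x3 edge-detection kernel over a tiny, uniform 4x4 "image"
image = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]
kernel = [[ 0, -1,  0],
          [-1,  4, -1],
          [ 0, -1,  0]]
print(conv2d(image, kernel))  # [[0, 0], [0, 0]] because a flat image has no edges
```

Training a CNN repeats this multiply-accumulate pattern millions of times per image batch, which is exactly the data-parallel workload GPUs accelerate.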
Cloud GPUs accelerate ML, AI, and Computer Vision tasks with frameworks like Apache MXNet, MATLAB, TensorFlow, PyTorch, and Theano.
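In these frameworks, GPU use is typically opt-in per device. A hedged PyTorch sketch of the common pattern, which falls back to the CPU when no GPU (or no `torch` install) is present:

```python
def pick_device():
    """Return 'cuda' when a CUDA-capable GPU is visible to PyTorch, else 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed; nothing to accelerate
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Running on: {device}")
# A model or tensor is then moved onto the device with .to(device),
# e.g. model.to(device) or batch.to(device).
```

On a Vultr Cloud GPU instance with the pre-installed driver stack, `torch.cuda.is_available()` should report `True` and the same code runs GPU-accelerated with no changes.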
Big Data extracts meaningful insights from large, complex datasets. Big data is characterized by the "three Vs": Volume, Velocity, and Variety. Volume is usually measured in terabytes, petabytes, or even exabytes. The data is created, read, moved, and analyzed at high velocity, such as on social media platforms. And it consists of a variety of data formats such as photos, video, audio, and complex documents.
Cloud GPUs process big data with tools like Hadoop to manage large distributed datasets, Apache Spark for fast, iterative processing of Hadoop data, or Apache Storm for real-time computation of unbounded data streams.
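The programming model behind Hadoop (and, in refined form, Spark) is map-reduce: map each record to key/value pairs, then reduce the values per key. A framework-free Python sketch of a word count, the canonical example; this is not Hadoop's API, just the shape of the model:

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1

def reduce_phase(pairs):
    """Reduce: sum the emitted counts for each key (word)."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

records = ["big data big insights", "data velocity"]
print(reduce_phase(map_phase(records)))
# {'big': 2, 'data': 2, 'insights': 1, 'velocity': 1}
```

At scale, the map and reduce phases run in parallel across many workers, which is why these frameworks pair well with highly parallel hardware.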
CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform and API for using a GPU for general-purpose computing, with languages like C, C++, and Fortran. To learn more about how to use CUDA with a Vultr Cloud GPU, please see Introduction to CUDA C and C++ and Tuning CUDA Applications for NVIDIA Ampere GPU Architecture at the NVIDIA Developer Zone.