Unparalleled AI and graphics performance for the data center

The NVIDIA L40S GPU, based on the Ada Lovelace architecture, is the most powerful universal GPU for the data center, delivering breakthrough multi-workload acceleration for large language model (LLM) inference and training, graphics, and video applications.


Universal Performance

Tensor performance

1,466 TFLOPS¹

RT Core performance

212 TFLOPS

Single-precision performance

91.6 TFLOPS

¹ Peak rates are based on GPU boost clock.

Get the right configuration for you

With Vultr Cloud GPU, accelerated by NVIDIA’s computing platform, you can use the NVIDIA L40S GPU through GPU passthrough or as an 8-GPU bare-metal server. Get up to speed quickly with Vultr GPU Stack, or take direct access to the L40S GPUs through passthrough or bare metal for greater control, including the ability to supply your own drivers for maximum software compatibility.

Powered by the NVIDIA Ada Lovelace architecture

Fourth-generation Tensor Cores

Hardware support for structural sparsity and an optimized TF32 format provides out-of-the-box performance gains for faster AI and data science model training. Accelerate AI-enhanced graphics capabilities with DLSS to upscale resolution with better performance in select applications.
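To make "structural sparsity" concrete, here is a minimal sketch in plain Python. It assumes the 2:4 pattern that NVIDIA describes for its sparse Tensor Cores, in which at most two of every four consecutive weights are non-zero. The function name `prune_2_of_4` and the magnitude-based selection rule are illustrative assumptions, not part of any NVIDIA or Vultr tooling.

```python
from typing import List


def prune_2_of_4(weights: List[float]) -> List[float]:
    """Zero the two smallest-magnitude values in every group of four.

    Hypothetical helper for illustration only; real pruning workflows
    usually fine-tune the model afterwards to recover accuracy.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned: List[float] = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
print(prune_2_of_4(row))  # half of each group of four is now zero
```

Because half of each group is known to be zero, the hardware can skip those multiplications, which is where the "with sparsity" peak figures come from.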

Third-generation RT Cores

Enhanced throughput and concurrent ray-tracing and shading capabilities improve ray-tracing performance, accelerating renders for product design and architecture, engineering, and construction workflows. See lifelike designs in action with hardware-accelerated motion blur and stunning real-time animations.

CUDA Cores

Accelerated single-precision floating point (FP32) throughput and improved power efficiency significantly boost performance for workflows like 3D model development and computer-aided engineering (CAE) simulation. Use enhanced 16-bit math capabilities (BF16) for mixed-precision workloads.
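As a rough illustration of what BF16 trades away, the sketch below rounds an FP32 value to BF16 using only the Python standard library. BF16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an FP32 value. The helper names are hypothetical, and the code handles finite inputs only (no special treatment of NaN).

```python
import struct


def to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern for a finite x (round to nearest even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias so that dropping the low 16 bits rounds to nearest, ties to even.
    rounding = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding) >> 16) & 0xFFFF


def from_bf16_bits(b: int) -> float:
    """Expand a BF16 bit pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def round_to_bf16(x: float) -> float:
    return from_bf16_bits(to_bf16_bits(x))


print(round_to_bf16(3.14159265))   # 3.140625
print(round_to_bf16(1.001953125))  # 1.0, the increment is below BF16 resolution
```

BF16 keeps the same exponent range as FP32 but only about three significant decimal digits, which is why mixed-precision workloads typically keep accumulations in FP32.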

Transformer Engine

Transformer Engine dramatically accelerates AI performance and improves memory utilization for both training and inference. Harnessing the power of the Ada Lovelace fourth-generation Tensor Cores, Transformer Engine intelligently scans the layers of transformer architecture neural networks and automatically recasts between FP8 and FP16 precisions, choosing the lower precision where a layer can tolerate it.

Efficiency and security

The L40S GPU is optimized for 24/7 enterprise data center operations and is designed, built, tested, and supported by NVIDIA to ensure maximum performance, durability, and uptime. The L40S GPU meets the latest data center standards, is Network Equipment-Building System (NEBS) Level 3 ready, and features secure boot with root of trust technology, providing an additional layer of security for data centers.

DLSS 3

The L40S GPU enables ultra-fast rendering and smoother frame rates with NVIDIA DLSS 3. This frame-generation technology leverages deep learning and the latest hardware innovations within the Ada Lovelace architecture and the L40S GPU, including fourth-generation Tensor Cores and an Optical Flow Accelerator, to boost rendering performance, deliver higher frames per second (FPS), and significantly reduce latency.

NVIDIA L40S Specifications

FP32: 91.6 teraFLOPS
TF32 Tensor Core: 366 teraFLOPS*
FP16: 733 teraFLOPS*
FP8: 1,466 teraFLOPS*
RT Core Performance: 212 teraFLOPS*
Max Power Consumption: 350 W

* With sparsity
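
The figures above are theoretical peaks, and the starred ones assume sparsity. For rough capacity planning, here is a small Python sketch that estimates dense (non-sparse) peaks and the aggregate peak of an 8-GPU bare-metal server. It assumes that sparsity doubles peak throughput and that throughput scales linearly across GPUs. Both are simplifications, and published dense figures may differ slightly from these estimates. All names in the code are illustrative.

```python
# Per-GPU peak figures copied from the specification list (teraFLOPS).
SPARSE_PEAK_TFLOPS = {"TF32": 366.0, "FP16": 733.0, "FP8": 1466.0}
FP32_PEAK_TFLOPS = 91.6
GPUS_PER_SERVER = 8  # the 8-GPU bare-metal configuration


def dense_estimate(sparse_tflops: float) -> float:
    """Estimate the dense peak, assuming sparsity doubles throughput."""
    return sparse_tflops / 2.0


def server_peak(per_gpu_tflops: float, gpus: int = GPUS_PER_SERVER) -> float:
    """Aggregate theoretical peak, assuming perfect linear scaling."""
    return per_gpu_tflops * gpus


for fmt, sparse in SPARSE_PEAK_TFLOPS.items():
    print(f"{fmt}: ~{dense_estimate(sparse):.1f} dense, "
          f"{server_peak(sparse):.0f} sparse per 8-GPU server")
print(f"FP32: {server_peak(FP32_PEAK_TFLOPS):.1f} per 8-GPU server")
```

Real workloads land well below these ceilings because of memory bandwidth, interconnect, and kernel efficiency, so treat the output as an upper bound rather than an expected result.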

Get started, or get some advice

Start your GPU-accelerated project now by signing up for a free Vultr account.
Or, if you’d like to speak with us regarding your needs, please reach out.