Hopper

Jul 02, 2024

Achieving High Mixtral 8x7B Performance with NVIDIA H100 Tensor Core GPUs and TensorRT-LLM

As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to...

9 MIN READ

Jul 01, 2024

How Cutting-Edge Computer Chips are Speeding Up the AI Revolution

Featured in Nature, this post delves into how GPUs and other advanced technologies are meeting the computational challenges posed by AI.

1 MIN READ

Jun 12, 2024

Introducing Grouped GEMM APIs in cuBLAS and More Performance Updates

The latest release of NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance...

7 MIN READ

Jun 12, 2024

NVIDIA Sets New Generative AI Performance and Scale Records in MLPerf Training v4.0

Generative AI models have a variety of uses, such as helping write computer code, crafting stories, composing music, generating images, producing videos, and...

11 MIN READ

Apr 25, 2024

Announcing Confidential Computing General Access on NVIDIA H100 Tensor Core GPUs

NVIDIA launched the initial release of the Confidential Computing (CC) solution in private preview for early access in July 2023 through NVIDIA LaunchPad....

3 MIN READ

An image of an NVIDIA H200 Tensor Core GPU.

Mar 27, 2024

NVIDIA H200 Tensor Core GPUs and NVIDIA TensorRT-LLM Set MLPerf LLM Inference Records

Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI...

11 MIN READ

Mar 25, 2024

Building High-Performance Applications in the Era of Accelerated Computing

AI is augmenting high-performance computing (HPC) with novel approaches to data processing, simulation, and modeling. Because of the computational requirements...

6 MIN READ

Mar 06, 2024

How to Accelerate Quantitative Finance with ISO C++ Standard Parallelism

Quantitative finance libraries are software packages that consist of mathematical, statistical, and, more recently, machine learning models designed for use in...

10 MIN READ

Dec 18, 2023

Deploying Retrieval-Augmented Generation Applications on NVIDIA GH200 Delivers Accelerated Performance

Large language model (LLM) applications are essential in enhancing productivity across industries through natural language. However, their effectiveness is...

10 MIN READ

Dec 14, 2023

Achieving Top Inference Performance with the NVIDIA H100 Tensor Core GPU and NVIDIA TensorRT-LLM

Best-in-class AI performance requires an efficient parallel computing architecture, a productive tool stack, and deeply optimized algorithms. NVIDIA released...

4 MIN READ

An illustration showing the steps "LLM" then "Optimize" then "Deploy."

Dec 04, 2023

NVIDIA TensorRT-LLM Enhancements Deliver Massive Large Language Model Speedups on NVIDIA H200

Large language models (LLMs) have seen dramatic growth over the last year, and the challenge of delivering great user experiences depends on both high-compute...

5 MIN READ

Illustration representing NeMo Framework.

Dec 04, 2023

New NVIDIA NeMo Framework Features and NVIDIA H200 Supercharge LLM Training Performance and Versatility

The rapid growth in the size, complexity, and diversity of large language models (LLMs) continues to drive an insatiable need for AI training performance....

9 MIN READ

Nov 28, 2023

One Giant Superchip for LLMs, Recommenders, and GNNs: Introducing NVIDIA GH200 NVL32

At AWS re:Invent 2023, AWS and NVIDIA announced that AWS will be the first cloud provider to offer NVIDIA GH200 Grace Hopper Superchips interconnected with...

9 MIN READ

An illustration representing HPC applications.

Nov 16, 2023

Unlock the Power of NVIDIA Grace and NVIDIA Hopper Architectures with Foundational HPC Software

High-performance computing (HPC) powers applications in simulation and modeling, healthcare and life sciences, industry and engineering, and more. In the modern...

7 MIN READ

Nov 13, 2023

Simplifying GPU Programming for HPC with NVIDIA Grace Hopper Superchip

The new hardware developments in NVIDIA Grace Hopper Superchip systems enable some dramatic changes to the way developers approach GPU programming. Most...

17 MIN READ

Nov 08, 2023

Setting New Records at Data Center Scale Using NVIDIA H100 GPUs and NVIDIA Quantum-2 InfiniBand

Generative AI is rapidly transforming computing, unlocking new use cases and turbocharging existing ones. Large language models (LLMs), such as OpenAI’s GPT...

19 MIN READ