LLMs

Jul 03, 2024

Power Advanced Coding Capabilities with DeepSeek Code LLM

DeepSeek Coder V2, available as an NVIDIA NIM microservice, enhances project-level coding and infilling tasks.

1 MIN READ

Jul 02, 2024

Addressing Hallucinations in Speech Synthesis LLMs with the NVIDIA NeMo T5-TTS Model

NVIDIA NeMo has released the T5-TTS model, a significant advancement in text-to-speech (TTS) technology. Based on large language models (LLMs), T5-TTS produces...

4 MIN READ

Jul 02, 2024

Achieving High Mixtral 8x7B Performance with NVIDIA H100 Tensor Core GPUs and TensorRT-LLM

As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to...

9 MIN READ

Jul 02, 2024

Advancing Security for Large Language Models with NVIDIA GPUs and Edgeless Systems

Edgeless Systems introduced Continuum AI, the first generative AI framework that keeps prompts encrypted at all times with confidential computing by combining...

6 MIN READ

Jul 02, 2024

Phi-3-Medium: Now Available on the NVIDIA API Catalog

Phi-3-Medium accelerates research with logic-rich features in both short (4K) and long (128K) context.

1 MIN READ

Jul 01, 2024

StarCoder2-15B: A Powerful LLM for Code Generation, Summarization, and Documentation

Trained on 600+ programming languages, StarCoder2-15B is now packaged as a NIM inference microservice available for free from the NVIDIA API catalog.

1 MIN READ

Jul 01, 2024

Google's New Gemma 2 Model Now Optimized and Available on NVIDIA API Catalog

Gemma 2, the next generation of Google Gemma models, is now optimized with TensorRT-LLM and packaged as an NVIDIA NIM inference microservice.

1 MIN READ

Jul 01, 2024

Deploy GPU-Optimized AI Software with One Click Using Brev.dev and NVIDIA NGC Catalog

Brev.dev is making it easier to develop AI solutions by leveraging software libraries, frameworks, and Jupyter Notebooks on the NVIDIA NGC catalog. You can use...

7 MIN READ

Jun 28, 2024

Create RAG Applications Using NVIDIA NIM and Haystack on Kubernetes

A step-by-step guide to building robust, scalable RAG apps with Haystack and NVIDIA NIMs on Kubernetes.

1 MIN READ

Jun 28, 2024

Introducing DoRA, a High-Performing Alternative to LoRA for Fine-Tuning

Full fine-tuning (FT) is commonly employed to tailor general pretrained models for specific downstream tasks. To reduce the training cost, parameter-efficient...

6 MIN READ

Jun 18, 2024

Leverage Our Latest Open Models for Synthetic Data Generation with NVIDIA Nemotron-4 340B

Since the introduction and subsequent wide adoption of large language models (LLMs), data has been the lifeblood of businesses building accurate and safe AI...

9 MIN READ

Jun 17, 2024

Video: Talk to Your Supply Chain Data Using NVIDIA NIM

NVIDIA operates one of the largest and most complex supply chains in the world. The supercomputers we build connect tens of thousands of NVIDIA GPUs with...

2 MIN READ

Jun 14, 2024

Level Up Your Skills with Five New NVIDIA Technical Courses

With AI introducing an unprecedented pace of technological innovation, staying ahead means keeping your skills up to date. The NVIDIA Developer Program gives...

4 MIN READ

Jun 12, 2024

Introducing Grouped GEMM APIs in cuBLAS and More Performance Updates

The latest release of the NVIDIA cuBLAS library, version 12.5, continues to deliver functionality and performance to deep learning (DL) and high-performance...

7 MIN READ

Jun 12, 2024

Demystifying AI Inference Deployments for Trillion Parameter Large Language Models

AI is transforming every industry, addressing grand human scientific challenges such as precision drug discovery and the development of autonomous vehicles, as...

14 MIN READ

Jun 12, 2024

NVIDIA Sets New Generative AI Performance and Scale Records in MLPerf Training v4.0

Generative AI models have a variety of uses, such as helping write computer code, crafting stories, composing music, generating images, producing videos, and...

11 MIN READ