Questions tagged [cuda]

CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model for NVIDIA GPUs (Graphics Processing Units). CUDA provides an interface to NVIDIA GPUs through a variety of programming languages, libraries, and APIs.

14,508 questions

0 votes

1 answer

15 views

Hide warnings and/or errors when importing Keras

My script imports the following Keras modules: from keras.models import Sequential from keras.layers import Dense, Input from keras.utils import to_categorical and every time the same warnings/errors ...

Gabriel

42.1k

asked 30 mins ago

-3 votes

0 answers

14 views

I'm getting a KeyError in numba but the key exists [duplicate]

I'm getting a KeyError in this code: from numba import jit, cuda import numpy as np from timeit import default_timer as timer @jit(target_backend='cuda') def func2(a): for ...

LiogamerYT

asked 3 hours ago

-1 votes

0 answers

27 views

Contradictory specs on tensor cores on my GPU [duplicate]

My GPU is a Quadro T1000 Mobile (SM_75). I've found contradictory device specs for its tensor cores. The GPU has 14 SMs, and the section on compute capability 7.x lists 8 tensor cores per SM outright. If so, ...

sof

9,509

asked yesterday

-2 votes

0 answers

12 views

CUDA issues in Kubeflow

In our company we have Kubeflow running with GPUs available. I'm using a standard docker image jupyter-pytorch-cuda-full:v1.8.0 as base image. torch.version = 2.1.0+cu121 is installed, the GPU is ...

Romero Azzalini

asked yesterday

0 votes

0 answers

22 views

Compiling CUDA programs with clang takes over an hour [closed]

I am using clang-18 to compile CUDA programs, and the compilation process does not report any errors, but it takes a very long time (even over an hour). The program can be compiled very quickly using ...

putong

asked yesterday

0 votes

0 answers

21 views

Cannot open source file "crtdefs.h" in VSC (CUDA script), but CUDA compilation works

My CUDA script (.cu) compiles without error, but the #include <stdio.h> line raises this error in VSC: #include errors detected. Please update your includePath. Squiggles are disabled for this ...

TaihouKai

asked yesterday

0 votes

1 answer

24 views

How to partition data in a warp based on a predicate so all keep items are consecutive

I have a warp full of data, some of which I want to keep and some I want to discard. I want to store the keep items in contiguous memory. For example, say I only want to keep prime numbers input ...

Johan

75.5k

asked yesterday
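
For the warp-partition question above, one commonly used approach is a ballot followed by a prefix population count: every lane votes on the predicate, and each keeping lane writes to the slot equal to the number of keeping lanes below it. The sketch below is a host-side C++ emulation of that idea for a single 32-lane warp, not device code and not taken from the question or its answer. On a GPU the vote would typically come from `__ballot_sync` and the count from `__popc`. The names `compact_warp`, `popcount32`, and `is_prime` are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int kWarpSize = 32;  // lanes per warp, assumed to be 32 here

// Portable population count; plays the role of CUDA's __popc in this emulation.
inline int popcount32(std::uint32_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical predicate for illustration: keep prime numbers.
inline bool is_prime(int v) {
    if (v < 2) return false;
    for (int d = 2; d * d <= v; ++d) {
        if (v % d == 0) return false;
    }
    return true;
}

// Emulates one warp on the host: returns the kept items, in lane order,
// packed into consecutive slots.
std::vector<int> compact_warp(const std::array<int, kWarpSize>& items) {
    // Step 1: every lane votes; on a GPU this would be one __ballot_sync call.
    std::uint32_t keep_mask = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (is_prime(items[lane])) keep_mask |= (1u << lane);
    }

    // Step 2: a keeping lane's output slot is the number of keeping lanes
    // with a lower lane index (popcount of the mask bits below this lane).
    std::vector<int> out(popcount32(keep_mask));
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (keep_mask & (1u << lane)) {
            const std::uint32_t lower = keep_mask & ((1u << lane) - 1u);
            const int slot = popcount32(lower);
            assert(slot < static_cast<int>(out.size()));
            out[slot] = items[lane];
        }
    }
    return out;
}
```

Calling `compact_warp` on the values 0 through 31 yields the primes below 32 in ascending order. In device code the two loops disappear, because each lane computes its own vote and its own slot in parallel.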

0 votes

1 answer

31 views

cuobjdump emits no PTX arithmetic instruction

Why doesn't cuobjdump emit the PTX mul instruction below? Has nvcc optimized the cubin output itself? Is the result calculated at compile-time? If so, for this simplest case nvcc can reasonably ...

sof

9,509

asked yesterday

0 votes

1 answer

36 views

Build issue with MatX concerning initialisation of shared variables

I'm attempting to build and install MatX onto my Linux machine. I'm following the instructions found here. Except when I run the make -j command, I get the following trace: /home/<me>/Documents/...

Hugo Phibbs

asked 2 days ago

0 votes

1 answer

32 views

Calculate network from cugraph

I have been playing around with cugraph and nx_cugraph in python, but I am struggling to calculate the number of connected components from the graph. I have been getting a lot of errors. To calculate ...

Tan Linh

asked 2 days ago

0 votes

1 answer

59 views

Problems evaluating CUDNN for SGEMM

I used cudnn to test sgemm for C[stride x stride] = A[stride x stride] x B[stride x stride] below, Configuration GPU: T1000/SM_75 cuda-12.0.1/driver-535 installed (via the multiverse repos on ubuntu-...

sof

9,509

asked Jul 17 at 10:08

0 votes

0 answers

32 views

CUDA Thrust Sort Error C2338: ‘unimplemented for this system’ in Visual Studio 2022 after Git Pull [closed]

I'm facing an issue with a CUDA project that was previously compiling and running successfully. After pulling the latest code from GitLab, I'm now encountering a static_assert error from the Thrust ...

Tang SuKai

asked Jul 17 at 9:29

2 votes

1 answer

32 views

CUDA: Nth set bit indexes using all threads in a warp in O(1) time

I have a 32-bit bit mask holding a set of valid items. From that bit mask I want to extract the indices of valid entries as a list. Let's say I obtained the bit mask using a ballot, and I want to know ...

Johan

75.5k

asked Jul 17 at 8:06
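
To make the expected result of the "Nth set bit" question above concrete, here is a small host-side C++ reference: lane n reports the index of the n-th set bit of the mask (counting from 0), or nothing if the mask has fewer set bits. This is only a correctness reference for checking a device implementation against. It loops per lane, so it is not the O(1)-per-thread technique the question asks about. The names `nth_set_bit` and `set_bit_indices` are hypothetical. On the device, CUDA's integer intrinsics include `__fns` for finding the n-th set bit, which may be worth checking in the CUDA Math API reference.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Index of the n-th set bit of mask (n counted from 0),
// or -1 if mask has fewer than n + 1 set bits.
int nth_set_bit(std::uint32_t mask, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        mask &= mask - 1;  // drop the lowest set bit (no-op once mask is 0)
    }
    if (mask == 0) return -1;

    // The lowest remaining set bit is the one we want.
    int idx = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        ++idx;
    }
    return idx;
}

// Emulates a 32-lane warp: lane n contributes the index of the n-th set bit,
// and lanes beyond the number of set bits contribute nothing.
std::vector<int> set_bit_indices(std::uint32_t mask) {
    std::vector<int> out;
    for (int lane = 0; lane < 32; ++lane) {
        const int idx = nth_set_bit(mask, lane);
        if (idx >= 0) out.push_back(idx);
    }
    return out;
}
```

For a mask with bits 1, 2, and 4 set (`0x16`), lanes 0, 1, and 2 report 1, 2, and 4, and the remaining lanes report nothing.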

-4 votes

0 answers

30 views

How to build and use Nvidia cuCollections on Windows? [closed]

Is there a way to make this work on Windows? https://github.com/NVIDIA/cuCollections I am unable to compile it and unable to use the .cuh files as part of my project. The bottom line is that the lib ...

realPro

1,759

asked Jul 17 at 6:41

-2 votes

0 answers

24 views

Want to run a Local LLM on Nvidia Jetson AGX Orin over GPU

I am looking to run a local LLM (Large Language Model) on an Nvidia Jetson AGX Orin using the GPU's CUDA cores. Could anyone provide guidance or share resources on how to achieve this? Thank you in ...

Mausam Jain

asked Jul 17 at 3:55
