Vulkan support #797

Merged
merged 11 commits into SciSharp:master from m0nsky:vulkan-backend on Jul 13, 2024

Conversation

@m0nsky (Contributor) commented on Jun 19, 2024

Based on #517, this PR adds support for the Vulkan backend. The native library codebase has changed quite a bit since that setup from February, so I've rewritten the Vulkan API detection; it now uses a regex to extract the Vulkan API version from vulkaninfo --summary. Tested on both Windows 10 and Ubuntu 22.04.

  • On Windows, when Vulkan is available, it will be used by default (unless CUDA is available)
  • On Linux, it requires vulkan-tools to be installed (sudo apt install vulkan-tools)
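
For reference, here is a minimal sketch of how such regex-based detection could look. The class and method names, the namespaces, and the exact regex pattern are illustrative assumptions, not the PR's actual implementation:

using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class VulkanDetection
{
    // Hypothetical helper: runs `vulkaninfo --summary` and extracts the Vulkan API version.
    // Returns null when vulkaninfo is missing or the output doesn't contain a version.
    public static Version? TryGetVulkanVersion()
    {
        try
        {
            var psi = new ProcessStartInfo("vulkaninfo", "--summary")
            {
                RedirectStandardOutput = true,
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using var process = Process.Start(psi);
            if (process is null)
                return null;

            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();

            // vulkaninfo --summary prints a line such as "Vulkan Instance Version: 1.3.275"
            var match = Regex.Match(output, @"Vulkan Instance Version:\s*(\d+)\.(\d+)\.(\d+)");
            if (!match.Success)
                return null;

            return new Version(
                int.Parse(match.Groups[1].Value),
                int.Parse(match.Groups[2].Value),
                int.Parse(match.Groups[3].Value));
        }
        catch
        {
            // vulkaninfo not installed or not on PATH -> treat Vulkan as unavailable
            return null;
        }
    }
}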

Both backends can be configured in the NativeLibraryConfig just like before:

NativeLibraryConfig
   .All
   .WithCuda(false)
   .WithVulkan(true);
I have tested the examples with both CUDA and Vulkan, and everything seems to work correctly.

[Attached media: vulkan_example_small]

Note:
Keep in mind that with llama.cpp commit 1debe72737ea131cb52975da3d53ed3a835df3a6 (which the LLamaSharp June 2024 binary update is based on), Vulkan currently crashes on multi-GPU setups (like mine) when SplitMode is set to GPUSplitMode.None, which is the default. Setting it to GPUSplitMode.Layer works correctly. This has since been fixed in a later PR.
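
For anyone hitting this on the affected commit, a minimal workaround sketch, assuming the SplitMode setting mentioned above is exposed on LLamaSharp's ModelParams (the namespaces, model path, and layer count are placeholders):

using LLama.Common;
using LLama.Native;

// Workaround sketch: explicitly request layer splitting instead of the
// default GPUSplitMode.None, which crashes on multi-GPU Vulkan setups
// with the llama.cpp commit mentioned above.
var parameters = new ModelParams("path/to/model.gguf")
{
    GpuLayerCount = 32,              // illustrative value
    SplitMode = GPUSplitMode.Layer   // avoids the multi-GPU crash
};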

Submitting as a draft for now, since it depends on #795 being merged first.

@martindevans (Member) commented

Note: #795 has now been merged.

@m0nsky (Contributor, Author) commented on Jun 19, 2024

Now that both #795 and #799 have been merged, I have done some additional fixes and cleanup on the Vulkan build process. Because the llama_cpp_commit wasn't being taken into account, the binaries from a fresh build were causing a crash.

I've run the updated build action on my local fork and tested both llama and llava using Vulkan on Windows 10; both are working as they should.

I've done the same test on WSL Ubuntu 22.04 using Vulkan through llvmpipe, which also worked fine for both automatic backend selection and inference.

Should I remove the draft status for this PR?

@martindevans (Member) commented

> Should I remove the draft status for this PR?

Presumably this can't be merged until there's a binary update? If so, we should probably leave it as a draft until I start that process.

@m0nsky (Contributor, Author) commented on Jun 20, 2024

Yup, you're right. We do need the corrected llama_cpp_commit for the Vulkan build before the binary update, so I'll split that into a separate PR.

@m0nsky marked this pull request as ready for review on July 11, 2024 20:27
@martindevans merged commit e907146 into SciSharp:master on Jul 13, 2024
6 checks passed
@m0nsky deleted the vulkan-backend branch on July 18, 2024 15:02